{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# `vaex` @ PiterPy Online 2020\n",
    "\n",
    "##  Анализ миллиарда поездок такси в Нью-Йорке (2009-2015)\n",
    "\n",
    "https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page\n",
    "\n",
    "Железо:\n",
    "* MacBook Pro, 2018 года\n",
    "* Процессор: 2.6 GHz 6-Core Intel Core i7\n",
    "* Оперативная память: 32 GB 2400 MHz DDR4\n",
    "* Жесткий диск: SSD, 500 GB"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T00:59:33.066581Z",
     "start_time": "2020-07-31T00:59:30.817708Z"
    }
   },
   "outputs": [],
   "source": [
    "import vaex\n",
    "from vaex.ui.colormaps import cm_plusmin\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "import pylab as plt\n",
    "import seaborn as sns\n",
    "\n",
    "import warnings; warnings.simplefilter('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Основные идеи, используемые в `vaex`:\n",
    " - Отображение файлов в память (memory mapping)\n",
    " - Ленивые вычисления (lazy evaluations)\n",
    " - Система выражений (\"виртуальные\" колонки)\n",
    " - Эффективные алгоритмы, параллельное вычисление"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Отображение файлов в память"
   ]
  },
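  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the idea (not of how `vaex` is implemented), the sketch below uses `numpy.memmap` to map a binary file into memory instead of reading it. The scratch file and its name are hypothetical and exist only for this example.\n",
    "\n",
    "```python\n",
    "import os\n",
    "import tempfile\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical scratch file holding one float32 column (illustration only).\n",
    "path = os.path.join(tempfile.mkdtemp(), 'column_f32.bin')\n",
    "np.arange(1_000_000, dtype=np.float32).tofile(path)\n",
    "\n",
    "# Map the file rather than loading it: the OS pages data in on access.\n",
    "column = np.memmap(path, dtype=np.float32, mode='r')\n",
    "\n",
    "# Only the slices that are actually touched need to be read from disk.\n",
    "first_values = column[:5]\n",
    "chunk_mean = float(column[:1000].mean())\n",
    "```\n",
    "\n",
    "Opening the mapping is cheap regardless of file size, which is the property the cells below rely on when opening a file larger than the machine's RAM."
   ]
  },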
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T00:59:58.312961Z",
     "start_time": "2020-07-31T00:59:58.160255Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "107G\t/Users/bulatyaminov/fun/datasets/demo/yellow_taxi_2009_2015_f32.hdf5\r\n"
     ]
    }
   ],
   "source": [
    "!du -h /Users/bulatyaminov/fun/datasets/demo/*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:00:29.340975Z",
     "start_time": "2020-07-31T01:00:29.289922Z"
    }
   },
   "outputs": [],
   "source": [
    "df = vaex.open('/Users/bulatyaminov/fun/datasets/demo/yellow_taxi_2009_2015_f32.hdf5')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Поддерживается множество форматов данных, в том числе считывание напрямую из S3:\n",
    "```\n",
    "df = vaex.open('s3://vaex/taxi/yellow_taxi_2015_f32s.hdf5?anon=true')\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Ленивые вычисления\n",
    "\n",
    "Считываются только те данные, которые нужны для вычисления."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:01:09.896986Z",
     "start_time": "2020-07-31T01:01:09.841816Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                                        </th><th>vendor_id  </th><th>pickup_datetime              </th><th>dropoff_datetime             </th><th>passenger_count  </th><th>payment_type  </th><th>trip_distance     </th><th>pickup_longitude  </th><th>pickup_latitude   </th><th>rate_code  </th><th>store_and_fwd_flag  </th><th>dropoff_longitude  </th><th>dropoff_latitude  </th><th>fare_amount       </th><th>surcharge  </th><th>mta_tax  </th><th>tip_amount        </th><th>tolls_amount  </th><th>total_amount      </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i>            </td><td>VTS        </td><td>2009-01-04 02:52:00.000000000</td><td>2009-01-04 03:02:00.000000000</td><td>1                </td><td>CASH          </td><td>2.630000114440918 </td><td>-73.99195861816406</td><td>40.72156524658203 </td><td>nan        </td><td>nan                 </td><td>-73.99380493164062 </td><td>40.6959228515625  </td><td>8.899999618530273 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>9.399999618530273 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i>            </td><td>VTS        </td><td>2009-01-04 03:31:00.000000000</td><td>2009-01-04 03:38:00.000000000</td><td>3                </td><td>Credit        </td><td>4.550000190734863 </td><td>-73.98210144042969</td><td>40.736289978027344</td><td>nan        </td><td>nan                 </td><td>-73.95584869384766 </td><td>40.768028259277344</td><td>12.100000381469727</td><td>0.5        </td><td>nan      </td><td>2.0               </td><td>0.0           </td><td>14.600000381469727</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i>            </td><td>VTS        </td><td>2009-01-03 15:43:00.000000000</td><td>2009-01-03 15:57:00.000000000</td><td>5                </td><td>Credit        </td><td>10.350000381469727</td><td>-74.0025863647461 </td><td>40.73974609375    </td><td>nan        </td><td>nan                 </td><td>-73.86997985839844 </td><td>40.770225524902344</td><td>23.700000762939453</td><td>0.0        </td><td>nan      </td><td>4.739999771118164 </td><td>0.0           </td><td>28.440000534057617</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i>            </td><td>DDS        </td><td>2009-01-01 20:52:58.000000000</td><td>2009-01-01 21:14:00.000000000</td><td>1                </td><td>CREDIT        </td><td>5.0               </td><td>-73.9742660522461 </td><td>40.79095458984375 </td><td>nan        </td><td>nan                 </td><td>-73.9965591430664  </td><td>40.731849670410156</td><td>14.899999618530273</td><td>0.5        </td><td>nan      </td><td>3.049999952316284 </td><td>0.0           </td><td>18.450000762939453</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i>            </td><td>DDS        </td><td>2009-01-24 16:18:23.000000000</td><td>2009-01-24 16:24:56.000000000</td><td>1                </td><td>CASH          </td><td>0.4000000059604645</td><td>-74.00157928466797</td><td>40.719383239746094</td><td>nan        </td><td>nan                 </td><td>-74.00837707519531 </td><td>40.7203483581543  </td><td>3.700000047683716 </td><td>0.0        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>3.700000047683716 </td></tr>\n",
       "<tr><td>...                                      </td><td>...        </td><td>...                          </td><td>...                          </td><td>...              </td><td>...           </td><td>...               </td><td>...               </td><td>...               </td><td>...        </td><td>...                 </td><td>...                </td><td>...               </td><td>...               </td><td>...        </td><td>...      </td><td>...               </td><td>...           </td><td>...               </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,922</i></td><td>VTS        </td><td>2015-12-31 23:59:56.000000000</td><td>2016-01-01 00:08:18.000000000</td><td>5                </td><td>1             </td><td>1.2000000476837158</td><td>-73.99381256103516</td><td>40.72087097167969 </td><td>1.0        </td><td>0.0                 </td><td>-73.98621368408203 </td><td>40.722469329833984</td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>1.7599999904632568</td><td>0.0           </td><td>10.5600004196167  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,923</i></td><td>CMT        </td><td>2015-12-31 23:59:58.000000000</td><td>2016-01-01 00:05:19.000000000</td><td>2                </td><td>2             </td><td>2.0               </td><td>-73.96527099609375</td><td>40.76028060913086 </td><td>1.0        </td><td>0.0                 </td><td>-73.93951416015625 </td><td>40.75238800048828 </td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>8.800000190734863 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,924</i></td><td>CMT        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:12:55.000000000</td><td>2                </td><td>2             </td><td>3.799999952316284 </td><td>-73.98729705810547</td><td>40.739078521728516</td><td>1.0        </td><td>0.0                 </td><td>-73.9886703491211  </td><td>40.69329833984375 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>14.800000190734863</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,925</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:10:26.000000000</td><td>1                </td><td>2             </td><td>1.9600000381469727</td><td>-73.99755859375   </td><td>40.72569274902344 </td><td>1.0        </td><td>0.0                 </td><td>-74.01712036132812 </td><td>40.705322265625   </td><td>8.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>9.800000190734863 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,926</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:21:30.000000000</td><td>1                </td><td>1             </td><td>1.059999942779541 </td><td>-73.9843978881836 </td><td>40.76725769042969 </td><td>1.0        </td><td>0.0                 </td><td>-73.99098205566406 </td><td>40.76057052612305 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>2.9600000381469727</td><td>0.0           </td><td>17.760000228881836</td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#              vendor_id    pickup_datetime                dropoff_datetime               passenger_count    payment_type    trip_distance       pickup_longitude    pickup_latitude     rate_code    store_and_fwd_flag    dropoff_longitude    dropoff_latitude    fare_amount         surcharge    mta_tax    tip_amount          tolls_amount    total_amount\n",
       "0              VTS          2009-01-04 02:52:00.000000000  2009-01-04 03:02:00.000000000  1                  CASH            2.630000114440918   -73.99195861816406  40.72156524658203   nan          nan                   -73.99380493164062   40.6959228515625    8.899999618530273   0.5          nan        0.0                 0.0             9.399999618530273\n",
       "1              VTS          2009-01-04 03:31:00.000000000  2009-01-04 03:38:00.000000000  3                  Credit          4.550000190734863   -73.98210144042969  40.736289978027344  nan          nan                   -73.95584869384766   40.768028259277344  12.100000381469727  0.5          nan        2.0                 0.0             14.600000381469727\n",
       "2              VTS          2009-01-03 15:43:00.000000000  2009-01-03 15:57:00.000000000  5                  Credit          10.350000381469727  -74.0025863647461   40.73974609375      nan          nan                   -73.86997985839844   40.770225524902344  23.700000762939453  0.0          nan        4.739999771118164   0.0             28.440000534057617\n",
       "3              DDS          2009-01-01 20:52:58.000000000  2009-01-01 21:14:00.000000000  1                  CREDIT          5.0                 -73.9742660522461   40.79095458984375   nan          nan                   -73.9965591430664    40.731849670410156  14.899999618530273  0.5          nan        3.049999952316284   0.0             18.450000762939453\n",
       "4              DDS          2009-01-24 16:18:23.000000000  2009-01-24 16:24:56.000000000  1                  CASH            0.4000000059604645  -74.00157928466797  40.719383239746094  nan          nan                   -74.00837707519531   40.7203483581543    3.700000047683716   0.0          nan        0.0                 0.0             3.700000047683716\n",
       "...            ...          ...                            ...                            ...                ...             ...                 ...                 ...                 ...          ...                   ...                  ...                 ...                 ...          ...        ...                 ...             ...\n",
       "1,173,057,922  VTS          2015-12-31 23:59:56.000000000  2016-01-01 00:08:18.000000000  5                  1               1.2000000476837158  -73.99381256103516  40.72087097167969   1.0          0.0                   -73.98621368408203   40.722469329833984  7.5                 0.5          0.5        1.7599999904632568  0.0             10.5600004196167\n",
       "1,173,057,923  CMT          2015-12-31 23:59:58.000000000  2016-01-01 00:05:19.000000000  2                  2               2.0                 -73.96527099609375  40.76028060913086   1.0          0.0                   -73.93951416015625   40.75238800048828   7.5                 0.5          0.5        0.0                 0.0             8.800000190734863\n",
       "1,173,057,924  CMT          2015-12-31 23:59:59.000000000  2016-01-01 00:12:55.000000000  2                  2               3.799999952316284   -73.98729705810547  40.739078521728516  1.0          0.0                   -73.9886703491211    40.69329833984375   13.5                0.5          0.5        0.0                 0.0             14.800000190734863\n",
       "1,173,057,925  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:10:26.000000000  1                  2               1.9600000381469727  -73.99755859375     40.72569274902344   1.0          0.0                   -74.01712036132812   40.705322265625     8.5                 0.5          0.5        0.0                 0.0             9.800000190734863\n",
       "1,173,057,926  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:21:30.000000000  1                  1               1.059999942779541   -73.9843978881836   40.76725769042969   1.0          0.0                   -73.99098205566406   40.76057052612305   13.5                0.5          0.5        2.9600000381469727  0.0             17.760000228881836"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Система выражений (\"виртуальные\" колонки)\n",
    "\n",
    "Обращение к колонке - одно из выражений."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:01:56.348574Z",
     "start_time": "2020-07-31T01:01:56.332436Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Expression = tip_amount\n",
       "Length: 1,173,057,927 dtype: float32 (column)\n",
       "---------------------------------------------\n",
       "         0     0\n",
       "         1     2\n",
       "         2  4.74\n",
       "         3  3.05\n",
       "         4     0\n",
       "      ...       \n",
       "1173057922  1.76\n",
       "1173057923     0\n",
       "1173057924     0\n",
       "1173057925     0\n",
       "1173057926  2.96"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tip_amount"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-03T08:13:57.992614Z",
     "start_time": "2020-06-03T08:13:57.989452Z"
    }
   },
   "source": [
    "Создание новой колонки не занимает дополнительную память."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:03:06.081033Z",
     "start_time": "2020-07-31T01:03:06.067314Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Expression = tip_percentage\n",
       "Length: 1,173,057,927 dtype: float32 (column)\n",
       "---------------------------------------------\n",
       "         0         0\n",
       "         1  0.136986\n",
       "         2  0.166667\n",
       "         3  0.165312\n",
       "         4         0\n",
       "        ...         \n",
       "1173057922  0.166667\n",
       "1173057923         0\n",
       "1173057924         0\n",
       "1173057925         0\n",
       "1173057926  0.166667"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['tip_percentage'] = df.tip_amount / df.total_amount\n",
    "df.tip_percentage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "То есть можно сразу увидеть примеры данных в новой колонке."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:03:23.400604Z",
     "start_time": "2020-07-31T01:03:23.374612Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                                        </th><th>vendor_id  </th><th>pickup_datetime              </th><th>dropoff_datetime             </th><th>passenger_count  </th><th>payment_type  </th><th>trip_distance     </th><th>pickup_longitude  </th><th>pickup_latitude   </th><th>rate_code  </th><th>store_and_fwd_flag  </th><th>dropoff_longitude  </th><th>dropoff_latitude  </th><th>fare_amount       </th><th>surcharge  </th><th>mta_tax  </th><th>tip_amount        </th><th>tolls_amount  </th><th>total_amount      </th><th>tip_percentage     </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i>            </td><td>VTS        </td><td>2009-01-04 02:52:00.000000000</td><td>2009-01-04 03:02:00.000000000</td><td>1                </td><td>CASH          </td><td>2.630000114440918 </td><td>-73.99195861816406</td><td>40.72156524658203 </td><td>nan        </td><td>nan                 </td><td>-73.99380493164062 </td><td>40.6959228515625  </td><td>8.899999618530273 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>9.399999618530273 </td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i>            </td><td>VTS        </td><td>2009-01-04 03:31:00.000000000</td><td>2009-01-04 03:38:00.000000000</td><td>3                </td><td>Credit        </td><td>4.550000190734863 </td><td>-73.98210144042969</td><td>40.736289978027344</td><td>nan        </td><td>nan                 </td><td>-73.95584869384766 </td><td>40.768028259277344</td><td>12.100000381469727</td><td>0.5        </td><td>nan      </td><td>2.0               </td><td>0.0           </td><td>14.600000381469727</td><td>0.13698630034923553</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i>            </td><td>VTS        </td><td>2009-01-03 15:43:00.000000000</td><td>2009-01-03 15:57:00.000000000</td><td>5                </td><td>Credit        </td><td>10.350000381469727</td><td>-74.0025863647461 </td><td>40.73974609375    </td><td>nan        </td><td>nan                 </td><td>-73.86997985839844 </td><td>40.770225524902344</td><td>23.700000762939453</td><td>0.0        </td><td>nan      </td><td>4.739999771118164 </td><td>0.0           </td><td>28.440000534057617</td><td>0.1666666567325592 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i>            </td><td>DDS        </td><td>2009-01-01 20:52:58.000000000</td><td>2009-01-01 21:14:00.000000000</td><td>1                </td><td>CREDIT        </td><td>5.0               </td><td>-73.9742660522461 </td><td>40.79095458984375 </td><td>nan        </td><td>nan                 </td><td>-73.9965591430664  </td><td>40.731849670410156</td><td>14.899999618530273</td><td>0.5        </td><td>nan      </td><td>3.049999952316284 </td><td>0.0           </td><td>18.450000762939453</td><td>0.16531164944171906</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i>            </td><td>DDS        </td><td>2009-01-24 16:18:23.000000000</td><td>2009-01-24 16:24:56.000000000</td><td>1                </td><td>CASH          </td><td>0.4000000059604645</td><td>-74.00157928466797</td><td>40.719383239746094</td><td>nan        </td><td>nan                 </td><td>-74.00837707519531 </td><td>40.7203483581543  </td><td>3.700000047683716 </td><td>0.0        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>3.700000047683716 </td><td>0.0                </td></tr>\n",
       "<tr><td>...                                      </td><td>...        </td><td>...                          </td><td>...                          </td><td>...              </td><td>...           </td><td>...               </td><td>...               </td><td>...               </td><td>...        </td><td>...                 </td><td>...                </td><td>...               </td><td>...               </td><td>...        </td><td>...      </td><td>...               </td><td>...           </td><td>...               </td><td>...                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,922</i></td><td>VTS        </td><td>2015-12-31 23:59:56.000000000</td><td>2016-01-01 00:08:18.000000000</td><td>5                </td><td>1             </td><td>1.2000000476837158</td><td>-73.99381256103516</td><td>40.72087097167969 </td><td>1.0        </td><td>0.0                 </td><td>-73.98621368408203 </td><td>40.722469329833984</td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>1.7599999904632568</td><td>0.0           </td><td>10.5600004196167  </td><td>0.1666666567325592 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,923</i></td><td>CMT        </td><td>2015-12-31 23:59:58.000000000</td><td>2016-01-01 00:05:19.000000000</td><td>2                </td><td>2             </td><td>2.0               </td><td>-73.96527099609375</td><td>40.76028060913086 </td><td>1.0        </td><td>0.0                 </td><td>-73.93951416015625 </td><td>40.75238800048828 </td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>8.800000190734863 </td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,924</i></td><td>CMT        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:12:55.000000000</td><td>2                </td><td>2             </td><td>3.799999952316284 </td><td>-73.98729705810547</td><td>40.739078521728516</td><td>1.0        </td><td>0.0                 </td><td>-73.9886703491211  </td><td>40.69329833984375 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>14.800000190734863</td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,925</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:10:26.000000000</td><td>1                </td><td>2             </td><td>1.9600000381469727</td><td>-73.99755859375   </td><td>40.72569274902344 </td><td>1.0        </td><td>0.0                 </td><td>-74.01712036132812 </td><td>40.705322265625   </td><td>8.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>9.800000190734863 </td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,926</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:21:30.000000000</td><td>1                </td><td>1             </td><td>1.059999942779541 </td><td>-73.9843978881836 </td><td>40.76725769042969 </td><td>1.0        </td><td>0.0                 </td><td>-73.99098205566406 </td><td>40.76057052612305 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>2.9600000381469727</td><td>0.0           </td><td>17.760000228881836</td><td>0.1666666716337204 </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#              vendor_id    pickup_datetime                dropoff_datetime               passenger_count    payment_type    trip_distance       pickup_longitude    pickup_latitude     rate_code    store_and_fwd_flag    dropoff_longitude    dropoff_latitude    fare_amount         surcharge    mta_tax    tip_amount          tolls_amount    total_amount        tip_percentage\n",
       "0              VTS          2009-01-04 02:52:00.000000000  2009-01-04 03:02:00.000000000  1                  CASH            2.630000114440918   -73.99195861816406  40.72156524658203   nan          nan                   -73.99380493164062   40.6959228515625    8.899999618530273   0.5          nan        0.0                 0.0             9.399999618530273   0.0\n",
       "1              VTS          2009-01-04 03:31:00.000000000  2009-01-04 03:38:00.000000000  3                  Credit          4.550000190734863   -73.98210144042969  40.736289978027344  nan          nan                   -73.95584869384766   40.768028259277344  12.100000381469727  0.5          nan        2.0                 0.0             14.600000381469727  0.13698630034923553\n",
       "2              VTS          2009-01-03 15:43:00.000000000  2009-01-03 15:57:00.000000000  5                  Credit          10.350000381469727  -74.0025863647461   40.73974609375      nan          nan                   -73.86997985839844   40.770225524902344  23.700000762939453  0.0          nan        4.739999771118164   0.0             28.440000534057617  0.1666666567325592\n",
       "3              DDS          2009-01-01 20:52:58.000000000  2009-01-01 21:14:00.000000000  1                  CREDIT          5.0                 -73.9742660522461   40.79095458984375   nan          nan                   -73.9965591430664    40.731849670410156  14.899999618530273  0.5          nan        3.049999952316284   0.0             18.450000762939453  0.16531164944171906\n",
       "4              DDS          2009-01-24 16:18:23.000000000  2009-01-24 16:24:56.000000000  1                  CASH            0.4000000059604645  -74.00157928466797  40.719383239746094  nan          nan                   -74.00837707519531   40.7203483581543    3.700000047683716   0.0          nan        0.0                 0.0             3.700000047683716   0.0\n",
       "...            ...          ...                            ...                            ...                ...             ...                 ...                 ...                 ...          ...                   ...                  ...                 ...                 ...          ...        ...                 ...             ...                 ...\n",
       "1,173,057,922  VTS          2015-12-31 23:59:56.000000000  2016-01-01 00:08:18.000000000  5                  1               1.2000000476837158  -73.99381256103516  40.72087097167969   1.0          0.0                   -73.98621368408203   40.722469329833984  7.5                 0.5          0.5        1.7599999904632568  0.0             10.5600004196167    0.1666666567325592\n",
       "1,173,057,923  CMT          2015-12-31 23:59:58.000000000  2016-01-01 00:05:19.000000000  2                  2               2.0                 -73.96527099609375  40.76028060913086   1.0          0.0                   -73.93951416015625   40.75238800048828   7.5                 0.5          0.5        0.0                 0.0             8.800000190734863   0.0\n",
       "1,173,057,924  CMT          2015-12-31 23:59:59.000000000  2016-01-01 00:12:55.000000000  2                  2               3.799999952316284   -73.98729705810547  40.739078521728516  1.0          0.0                   -73.9886703491211    40.69329833984375   13.5                0.5          0.5        0.0                 0.0             14.800000190734863  0.0\n",
       "1,173,057,925  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:10:26.000000000  1                  2               1.9600000381469727  -73.99755859375     40.72569274902344   1.0          0.0                   -74.01712036132812   40.705322265625     8.5                 0.5          0.5        0.0                 0.0             9.800000190734863   0.0\n",
       "1,173,057,926  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:21:30.000000000  1                  1               1.059999942779541   -73.9843978881836   40.76725769042969   1.0          0.0                   -73.99098205566406   40.76057052612305   13.5                0.5          0.5        2.9600000381469727  0.0             17.760000228881836  0.1666666716337204"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Vaex не будет вычислять выражение, пока не потребуется:\n",
    " - Если результат операции - новая колонка, то вычисление будет ленивым.\n",
    " - Если результат операции - новая структура данных (число, список), то результат будет вычислен сразу."
   ]
  },
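  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two rules above can be mimicked with a toy class. This is a sketch under stated assumptions: `LazyColumn` is a hypothetical name and not part of `vaex`; it stores a recipe for the values and runs it only when an aggregate is requested.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "class LazyColumn:\n",
    "    # Toy stand-in for a virtual column: keeps a recipe, not the data.\n",
    "    def __init__(self, compute):\n",
    "        self._compute = compute\n",
    "        self.evaluations = 0  # how many times the recipe has run\n",
    "\n",
    "    def values(self):\n",
    "        self.evaluations += 1\n",
    "        return self._compute()\n",
    "\n",
    "    def mean(self):\n",
    "        # An aggregate returns a number, so it must evaluate now.\n",
    "        return float(self.values().mean())\n",
    "\n",
    "\n",
    "tip = np.array([0.0, 2.0, 4.0])\n",
    "total = np.array([10.0, 20.0, 40.0])\n",
    "\n",
    "# Defining the column does no arithmetic yet.\n",
    "ratio = LazyColumn(lambda: tip / total)\n",
    "evaluations_before = ratio.evaluations\n",
    "\n",
    "# Asking for a number triggers the computation.\n",
    "result = ratio.mean()\n",
    "evaluations_after = ratio.evaluations\n",
    "```\n",
    "\n",
    "Here `evaluations_before` stays at zero until `mean()` is called, which mirrors the split between lazy column definitions and eager aggregates."
   ]
  },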
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:04:29.047768Z",
     "start_time": "2020-07-31T01:04:23.137626Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(inf)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tip_percentage.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Фильтрация создает виртуальную копию таблицы. Данные не копируются!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:04:50.334116Z",
     "start_time": "2020-07-31T01:04:49.233589Z"
    }
   },
   "outputs": [],
   "source": [
    "df_filtered = df[df.total_amount > 0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:05:09.775980Z",
     "start_time": "2020-07-31T01:05:01.167336Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(0.07121788)"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_filtered.tip_percentage.mean()"
   ]
  },
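  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The change from `inf` to about 0.07 between the two means above is what one would expect if some trips have a `total_amount` of zero with a non-zero tip, since floating-point division by zero yields `inf`. That cause is an assumption; the notebook does not check it directly. A small NumPy sketch with made-up values (not rows from the dataset) reproduces the effect:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Made-up sample values; the last trip has a tip but a zero total.\n",
    "tip = np.array([0.0, 2.0, 1.5], dtype=np.float32)\n",
    "total = np.array([9.4, 14.6, 0.0], dtype=np.float32)\n",
    "\n",
    "# Silence the division warning; 1.5 / 0.0 becomes inf.\n",
    "with np.errstate(divide='ignore', invalid='ignore'):\n",
    "    ratio = tip / total\n",
    "\n",
    "# A single inf is enough to make the overall mean inf.\n",
    "mean_all = ratio.mean()\n",
    "\n",
    "# Keeping only rows with a positive total gives a finite mean.\n",
    "mean_filtered = ratio[total > 0].mean()\n",
    "```\n",
    "\n",
    "This is why the filter on `total_amount > 0` is applied before averaging `tip_percentage`."
   ]
  },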
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Эффективные алгоритмы"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:05:35.995321Z",
     "start_time": "2020-07-31T01:05:35.986666Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of rows: 1,173,057,927\n",
      "Number of columns: 19\n"
     ]
    }
   ],
   "source": [
    "# Check length of file\n",
    "rows, columns = df.shape\n",
    "print(f'Number of rows: {rows:,}')\n",
    "print(f'Number of columns: {columns}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:06:14.290092Z",
     "start_time": "2020-07-31T01:05:42.064553Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>pickup_datetime</th>\n",
       "      <th>passenger_count</th>\n",
       "      <th>total_amount</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>dtype</th>\n",
       "      <td>datetime64[ns]</td>\n",
       "      <td>int64</td>\n",
       "      <td>float32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057925</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>NA</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>1970-01-01T00:00:01.953533625</td>\n",
       "      <td>1.6844313554517245</td>\n",
       "      <td>13.31476581420128</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>6.22239e+16</td>\n",
       "      <td>1.33032</td>\n",
       "      <td>1098.43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>2009-01-01T00:00:27.365015552</td>\n",
       "      <td>0</td>\n",
       "      <td>-2.14748e+07</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>2016-01-01T00:00:49.632313344</td>\n",
       "      <td>255</td>\n",
       "      <td>3.95061e+06</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     pickup_datetime     passenger_count       total_amount\n",
       "dtype                 datetime64[ns]               int64            float32\n",
       "count                     1173057927          1173057927         1173057925\n",
       "NA                                 0                   0                  2\n",
       "mean   1970-01-01T00:00:01.953533625  1.6844313554517245  13.31476581420128\n",
       "std                      6.22239e+16             1.33032            1098.43\n",
       "min    2009-01-01T00:00:27.365015552                   0       -2.14748e+07\n",
       "max    2016-01-01T00:00:49.632313344                 255        3.95061e+06"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['pickup_datetime', 'passenger_count', 'total_amount'].describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Когда возможно, используется подход \"map/reduce\". Например, для вычисления `min`:\n",
    "* Данные разделяются на куски.\n",
    "* Процессоры находят `min` для каждого куска.\n",
    "* Берется `min` среди всех кусков.\n",
    "\n",
    "Минимизируется кол-во проходов по данным.\n",
    "* Для внутренних алгоритмов.\n",
    "* Пользователь [тоже имеет контроль](https://docs.vaex.io/en/latest/tutorial.html#Parallel-computations) при помощи `delay=True` и `df.execute()`:\n",
    "```\n",
    "delayed_min = df.min(df.passenger_count, delay=True)\n",
    "delayed_max = df.max(df.passenger_count, delay=True)\n",
    "df.execute()  # single pass through the column\n",
    "print(delayed_min.get(), delayed_max.get())\n",
    "```"
   ]
  },
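  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The map/reduce steps described above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not vaex's actual implementation: `chunked_min` and `n_chunks` are hypothetical names introduced here, and a thread pool stands in for vaex's parallel workers.\n",
    "\n",
    "```python\n",
    "from concurrent.futures import ThreadPoolExecutor\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "def chunked_min(values, n_chunks=4):\n",
    "    # Map step: split the data into chunks (dropping empty ones)\n",
    "    # and compute the minimum of each chunk in a worker thread.\n",
    "    chunks = [c for c in np.array_split(values, n_chunks) if c.size > 0]\n",
    "    with ThreadPoolExecutor() as pool:\n",
    "        partial_minima = list(pool.map(np.min, chunks))\n",
    "    # Reduce step: the global minimum is the minimum of the partial minima.\n",
    "    return min(partial_minima)\n",
    "\n",
    "\n",
    "data = np.array([5, 3, 8, 1, 9, 2, 7])\n",
    "assert chunked_min(data) == data.min()\n",
    "```\n",
    "\n",
    "Because the `min` of the per-chunk minima equals the global `min`, each chunk can be processed independently and the data only needs to be read once."
   ]
  },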
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Пример: анализ данных поездок Нью-Йоркского такси\n",
    "\n",
    "### Удаление незаполненных строк"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:07:44.058009Z",
     "start_time": "2020-07-31T01:07:43.375421Z"
    }
   },
   "outputs": [],
   "source": [
    "# Drop NANs\n",
    "df = df.dropna(column_names=['dropoff_latitude', 'dropoff_longitude', 'pickup_latitude'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Выбор адекватного количества пассажиров"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:08:12.526276Z",
     "start_time": "2020-07-31T01:07:56.299464Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "844fae50b3bc4494861b4cec7b6afc1f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "1      812315955\n",
       "2      172863547\n",
       "5       81923905\n",
       "3       51435661\n",
       "6       25614703\n",
       "4       24983364\n",
       "0        3903564\n",
       "208         1515\n",
       "7            435\n",
       "9            352\n",
       "8            313\n",
       "49            26\n",
       "10            17\n",
       "255           10\n",
       "129            7\n",
       "dtype: int64"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.passenger_count.value_counts(progress='widget')[:15]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:08:44.263573Z",
     "start_time": "2020-07-31T01:08:43.486404Z"
    }
   },
   "outputs": [],
   "source": [
    "# Filter abnormal number of passengers\n",
    "df = df[(df.passenger_count > 0) & (df.passenger_count < 7)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Проверка длины поездки"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:09:05.597538Z",
     "start_time": "2020-07-31T01:08:53.116824Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "357cd85ecc0c426fb8b8b7095840a3d1",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjgAAAEYCAYAAABRMYxdAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAVjUlEQVR4nO3df5TldX3f8ecLMNqoq5DdrOvCZg2hiWuarHYPq2JzNmJbpNE1RhGouFgs2Pgj9pi2NO2JP87JOaRGPUb8AY0EtIiohOx6pGqCGGKCyKKrwFAqxR9hHd0F1IU2MVny7h/3O3Y6zJ25OzP33pnPPB/n3DP3fr+f+/2+d75zZ1/z+X6+30+qCkmSpJYcNe4CJEmSlpoBR5IkNceAI0mSmmPAkSRJzTHgSJKk5hwz7gIGsXbt2tq8efO4y5AkScvMrbfeel9VrZu5fEUEnM2bN7N3795xlyFJkpaZJN+cbbmnqCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceAI0mSmmPAkSRJzTHgSJKk5qyIG/1p/D5887fYvW9/3/U7t27k7O2bRliRJEn92YOjgezet5+JyUOzrpuYPDRn+JEkadTswdHAtmxYw9UXPOsRy192yU1jqEaSpP7swZEkSc0x4EiSpOYYcCRJUnMMOJIkqTlDCzhJTkhyQ5KJJHck+Y1u+XFJ/iTJ17qvxw6rBkmStDoNswfnMPDGqtoCPBN4TZItwIXA9VV1EnB991qSJGnJDC3gVNVkVX2pe/4gcCewEdgJXNE1uwJ40bBqkCRJq9NIxuAk2Qw8HbgZWF9Vk92q7wDr+7zn/CR7k+w9ePDgKMqUJEmNGHrASfI44BrgDVX1/90Kt6oKqNneV1WXVtW2qtq2bt26YZcpSZIaMtSAk+RR9MLNlVX1R93i7ybZ0K3fABwYZg2SJGn1GeZVVAE+ANxZVe+YtmoPsKt7vgvYPawaJEnS6jTMuahOAc4Bbkuyr1v2W8BFwEeTnAd8EzhjiDVIkqRVaGgBp6o+D6TP6lOHtV9JkiTvZCxJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceAI0mSmmPAkSRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceAI0mSmmPAkSRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNGVrASXJZkgNJbp+27M1J9ifZ1z1OH9b+JUnS6jXMHpzLgdNmWf7OqtraPa4b4v4lSdIqNbSAU1U3Ag8Ma/uSJEn9jGMMzmuTfLU7hXVsv0ZJzk+yN8negwcPjrI+SZK0wo064LwPOBHYCkwCb+/XsKouraptVbVt3bp1IypPkiS1YKQBp6q+W1UPV9XfA/8VOHmU+5ckSavDSANOkg3TXv4qcHu/tpIkSQt1zLA2nOQqYAewNsm9wJuAHUm2AgV8A7hgWPuXJEmr19ACTlWdNcviDwxrf5IkSVO8k7EkSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceAI0mSmmPAkSRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5RxRwkjw2ydHDKkaSJGkpzBlwkhyV5Owkn0xyAPgfwGSSiSRvS/IzoylTkiRpcPP14NwAnAj8R+BJVXVCVf0k8BzgC8DvJnn5kGuUJEk6IsfMs/55VfV3MxdW1QPANcA1SR41lMok
SZIWaM4enKr6uyRnAyQ5s1+bYRQmSZK0UIMMMt6Y5Azg+GEXI0mStBTmG2T8JuA44ErguCS/PZKqJEmSFmG+U1RvAe4HzgHur6q3jqQqSZKkRRjkFNVkVX0E+Pawi5EkSVoK852ielxVXQlQVVf1azOMwiRJkhZqvh6c3UnenuSXkjx2amGSn05yXpJPA6cNt0RJkqQjM+d9cKrq1CSnAxcApyQ5FjgM3AVcB+yqqu8Mv0xJkqTBzXejP6rqOnphRpIkaUUYaLLNJNcPskySJGk5mLMHJ8ljgB8H1nanp9KtWgNsHHJtkiRJCzLfKaoLgDcATwZu5f8FnEPAxcMrS5IkaeHmG2T8LuBdSV5XVe8eUU2SJEmLMu8gY4CqeneSZwObp7+nqj44pLokSZIWbKCAk+RDwInAPuDhbnEBBhxJkrTsDBRwgG3AlqqqQTec5DLgV4ADVfXz3bLjgKvp9QR9Azijqr53JAVLkiTNZ6DLxIHbgScd4bYv55F3Ob4QuL6qTgKu715LkiQtqUF7cNYCE0m+CPxwamFVvbDfG6rqxiSbZyzeCezonl8BfA74DwPWIEmSNJBBA86bl2h/66tqsnv+HWB9v4ZJzgfOB9i0adMS7V6SJK0Gg15F9WdLveOqqiR9x/RU1aXApQDbtm0beOyPJEnSoFM1PJjkUPf4myQPJzm0gP19N8mGbpsbgAML2IYkSdKcBgo4VfX4qlpTVWuAfwD8GvDeBexvD7Cre74L2L2AbUiSJM1p0KuofqR6/hj453O1S3IVcBPws0nuTXIecBHwT5N8DXhe91qSJGlJDXqjvxdPe3kUvfvi/M1c76mqs/qsOnWw0iRJkhZm0KuoXjDt+WF6N+nbueTVSJIkLYFBr6J65bALkSRJWiqDXkV1fJJrkxzoHtckOX7YxUmSJC3EoIOM/5DeFVBP7h6f6JZJkiQtO4MGnHVV9YdVdbh7XA6sG2JdkiRJCzZowLk/ycuTHN09Xg7cP8zCJEmSFmrQgPOvgDPozR81CbwEOHdINUmSJC3KoJeJvxXYVVXfA0hyHPB79IKPJEnSsjJoD84vTIUbgKp6AHj6cEqSJElanEEDzlFJjp160fXgDNr7I0mSNFKDhpS3Azcl+Vj3+qXA7wynJEmSpMUZ9E7GH0yyF3hut+jFVTUxvLIkSZIWbuDTTF2gMdRIkqRlb9AxOJIkSSuGAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceAI0mSmmPAkSRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5x4y7ALVhYvIQL7vkplnX7dy6kbO3bxpxRZKk1cyAo0XbuXVj33UTk4cADDiSpJEy4GjRzt6+qW+A6derI0nSMDkGR5IkNceAI0mSmmPAkSRJzRnLGJwk3wAeBB4GDlfVtnHUIUmS2jTOQca/XFX3jXH/kiSpUZ6ikiRJzRlXwCngM0luTXL+bA2SnJ9kb5K9Bw8eHHF5kiRpJRtXwHlOVT0DeD7wmiS/NLNBVV1aVduqatu6detGX6EkSVqxxjIGp6r2d18PJLkWOBm4cRy1aPjmmsYBnMpBkrT0Rt6Dk+SxSR4/9Rz4Z8Dto65Do7Fz60a2bFjTd/3E5CF279s/wookSavBOHpw1gPXJpna/4er6lNjqEMjMNc0DtCbysGJOiVJS23kAaeq7gF+cdT71fLkRJ2SpGFwsk2NlRN1SpKGwfvgSJKk5hhwJElSczxFpWXNS8wlSQthwNGyNdcAZHAQsiSpPwOOlq1BLjGXJGk2jsGRJEnNMeBIkqTmGHAkSVJzDDiSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceAI0mSmmPA
kSRJzTHgSJKk5hwz7gKkxZiYPMTLLrlp1nU7t27k7O2bRlyRJGk5MOBoxdq5dWPfdROThwAMOJK0ShlwtGKdvX1T3wDTr1dHkrQ6OAZHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5q/4y8bd84g4mvn2o73pvFidJ0spjD84cJiYPsXvf/nGXIUmSjtCq78F50wue1nedN4uTJGllsgdHkiQ1Z9X34IzDh2/+1qJOfS10XNBi9jsxeYgtG9Ys6L1avub7mXAMmqSVyh6cMdi9b/+PJoM8UosZF7SY/W7ZsGbOyS21Ms31M+EYNEkrmT04Y7JlwxquvuBZR/y+xY4LWuh+1a5+PxOOQZO0ktmDI0mSmmPAkSRJzRlLwElyWpK7ktyd5MJx1CBJkto18oCT5GjgPcDzgS3AWUm2jLoOSZLUrnEMMj4ZuLuq7gFI8hFgJzAxhlrmNTF5aMkHWy72kuuF1rTaLvUexrFrzXw/E34PJS3WlievmfOmusMyjoCzEfiraa/vBbbPbJTkfOB8gE2bxnMfjmFdFr2YS64XU9NqutR7tfw7F2uunwm/h5JWslTVaHeYvAQ4rape1b0+B9heVa/t955t27bV3r17R1WiJElaIZLcWlXbZi4fxyDj/cAJ014f3y2TJElaEuMIOLcAJyV5SpIfA84E9oyhDkmS1KiRj8GpqsNJXgt8GjgauKyq7hh1HZIkqV1jmaqhqq4DrhvHviVJUvu8k7EkSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOaM/E7GC5HkIPDNWVatBe4bcTl6JI/D8uGxWD48FsuHx2L5GMax+KmqWjdz4YoIOP0k2Tvb7Zk1Wh6H5cNjsXx4LJYPj8XyMcpj4SkqSZLUHAOOJElqzkoPOJeOuwABHoflxGOxfHgslg+PxfIxsmOxosfgSJIkzWal9+BIkiQ9ggFHkiQ1Z8UEnCTHJfmTJF/rvh7bp93DSfZ1jz2jrrNlSU5LcleSu5NcOMv6Rye5ult/c5LNYyhzVRjgWJyb5OC0z8KrxlFn65JcluRAktv7rE+S3++O01eTPGPUNa4WAxyLHUl+MO0z8dujrnE1SHJCkhuSTCS5I8lvzNJmJJ+LFRNwgAuB66vqJOD67vVs/rqqtnaPF46uvLYlORp4D/B8YAtwVpItM5qdB3yvqn4GeCfwu6OtcnUY8FgAXD3ts/AHIy1y9bgcOG2O9c8HTuoe5wPvG0FNq9XlzH0sAP582mfirSOoaTU6DLyxqrYAzwReM8vvp5F8LlZSwNkJXNE9vwJ40fhKWZVOBu6uqnuq6m+Bj9A7JtNNP0YfB05NkhHWuFoMciw0AlV1I/DAHE12Ah+sni8AT0yyYTTVrS4DHAuNQFVNVtWXuucPAncCG2c0G8nnYiUFnPVVNdk9/w6wvk+7xyTZm+QLSV40mtJWhY3AX017fS+P/KH9UZuqOgz8APiJkVS3ugxyLAB+rev+/XiSE0ZTmmYY9FhpNJ6V5CtJ/nuSp427mNZ1wxSeDtw8Y9VIPhfHLPUGFyPJnwJPmmXVf5r+oqoqSb/r23+qqvYn+Wngs0luq6r/tdS1SsvcJ4CrquqHSS6g17P23DHXJI3Tl+j9//BQktOBP6Z3ikRDkORxwDXAG6rq0DhqWFYBp6qe129dku8m2VBVk11X1oE+29jffb0nyefopUcDzuLtB6b3AhzfLZutzb1JjgGeANw/mvJWlXmPRVVN/77/AfBfRlCXHmmQz41GYPp/slV1XZL3JllbVU7CucSSPIpeuLmyqv5oliYj+VyspFNUe4Bd3fNdwO6ZDZIcm+TR3fO1wCnAxMgqbNstwElJnpLkx4Az6R2T6aYfo5cAny3vJDkM8x6LGeezX0jvPLhGbw/wiu6qkWcCP5h2ql0jlORJU2MCk5xM7/8//wBbYt33+APAnVX1jj7NRvK5WFY9OPO4CPhokvOAbwJnACTZBry6ql4FPBW4JMnf
0/vhvaiqDDhLoKoOJ3kt8GngaOCyqrojyVuBvVW1h94P9YeS3E1vsN+Z46u4XQMei9cneSG9KxoeAM4dW8ENS3IVsANYm+Re4E3AowCq6v3AdcDpwN3A/wFeOZ5K2zfAsXgJ8G+SHAb+GjjTP8CG4hTgHOC2JPu6Zb8FbILRfi6cqkGSJDVnJZ2ikiRJGogBR5IkNceAI0mSmmPAkSRJzTHgSJKkkZtvgtQZbTd1k3h+ubtD++nzvceAI0mSxuFy5p8gdcp/Bj5aVU+ndwuS9873BgOOpCOS5IlJfn2O9X+5BPs4N8nF3fNXJ3nFHG13JHn2YvcpabRmmyA1yYlJPpXk1iR/nuTnppoDa7rnTwC+Pd/2V9KN/iQtD08Efp0Zf0ElOaaqDlfVkoaN7sZgc9kBPAQsOlhJGrtL6d2892tJttP7PfNc4M3AZ5K8Dngs0Hdqpyn24Eg6UhcBJybZl+SW7q+sPXTToiR5qPu6I8mNST6Z5K4k70/S93dOklcm+Z9JvkjvbqhTy9+c5De7569PMtGdg/9IN1vxq4F/29XzT5K8IMnN3bn6P02yftp2LkvyuST3JHn9tH28otvmV5J8qFu2Lsk13b/xliSnIGlougk6nw18rLsL8iXA1LQzZwGXV9Xx9O6C/KG5fp+APTiSjtyFwM9X1dYkO4BPdq+/Pkvbk4Et9KZX+RTwYuDjMxt1c2e9BfjHwA+AG4Av99n3U7pZ0p9YVd9P8n7goar6vW5bxwLPrKpK8irg3wNv7N7/c8AvA48H7kryPuAf0ju//+yqui/JcV3bdwHvrKrPJ9lEb2qMpw78XZJ0pI4Cvl9VW2dZdx7deJ2quinJY4C19Jl4e2pjkrQYX+wTbqbW3VNVDwNXAc/p02478LmqOlhVfwtc3afdV4Erk7yc3jxbszke+HSS24B/Bzxt2rpPVtUPuxmkDwDr6XV/f2xqVumqmhoT8Dzg4u4vyT3Amu4vTElD0M34/vUkL4XexJ1JfrFb/S3g1G75U4HHAAfn2p4BR9Ji/e851s2c7G6xk9/9C+A9wDOAW5LM1gv9buDiqvpHwAX0fhFO+eG05w8zdy/2UfR6grZ2j41V9dDiypc0pZsg9SbgZ5Pc202m/S+B85J8BbgD2Nk1fyPwr7vlVwHnzjdZqqeoJB2pB+md4hnEyUmeQu8U1cvoDSCczc3Au5L8BHAIeCnwlekNuvPtJ1TVDUk+T+9S0cd19ayZ1vQJwP7u+a4BavwscG2Sd1TV/UmO63pxPgO8Dnhbt/+tVbVvgO1JGkBVndVn1SMuHa+qCaaNzRuEPTiSjkhV3Q/8RXdzrrfN0/wW4GLgTuDrwLV9tjlJ7yqJm4C/6NrPdDTw37pTT18Gfr+qvg98AvjVqUHG3XY+luRW4L4B/j13AL8D/Fn31+E7ulWvB7Z1g48n6A1mlrRCZJ4eHklakG4A8m9W1a+MuRRJq5A9OJIkqTn24EgaqSQ3A4+esficqrptHPVIapMBR5IkNcdTVJIkqTkGHEmS1BwDjiRJao4BR5IkNef/ArYBwL+lfftnAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 576x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(8,4))\n",
    "df.plot1d('trip_distance', limits='minmax', f='log1p', progress='widget')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:09:29.064675Z",
     "start_time": "2020-07-31T01:09:24.651631Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(7844544)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# How many trips have 0.0 distance?\n",
    "(df.trip_distance == 0).astype('int').sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:09:41.725933Z",
     "start_time": "2020-07-31T01:09:39.089325Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "fc85b3da84c145f080ae54363a6b887d",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Максимальная дистанция среди данных: 198623008.0 мили\n",
      "\n",
      "Это 5.9 от расстояния до Марса!\n"
     ]
    }
   ],
   "source": [
    "# What is the largest distance?\n",
    "trip_distance_max = df.trip_distance.max(progress='widget')\n",
    "print()\n",
    "print(f'Максимальная дистанция среди данных: {trip_distance_max} мили')\n",
    "print()\n",
    "print('Это %1.1f от расстояния до Марса!' % (trip_distance_max / 33_900_000))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:09:55.445667Z",
     "start_time": "2020-07-31T01:09:52.606765Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9a57e8535beb41d5a23f6d47a4a75505",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjgAAAEYCAYAAABRMYxdAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAcYklEQVR4nO3dfbRcd13v8fenT7AuJX0gEWPS2MKqVyNiwGPDg0oRxNALzcUr9EGkIBpYWtSrXq1LV1vKci3Qi/ciIG3FWkFoy4PQKMGigqLSlqQQSnuwEAvSxEBDq40Py0Lhe/+YHe5wMjNncnJm5sye92utWZnZv9+e+e7OzJlPf/u3905VIUmS1CbHTLoASZKk5WbAkSRJrWPAkSRJrWPAkSRJrWPAkSRJrWPAkSRJrTOVASfJNUnuTXLHEH03JPlQko8nuT3JOeOoUZIkTc5UBhzgWmDLkH1/HXhHVT0BOB/43VEVJUmSVoapDDhV9WHg/u5lSR6b5M+S3Jbkb5J8+6HuwKrm/knAP42xVEmSNAHHTbqAZXQ18PKq+kySzXRGan4QuBz4QJJXAI8Anjm5EiVJ0ji0IuAkORF4CvDOJIcWP6z59wLg2qp6bZInA29N8riq+toESpUkSWPQioBDZ1fbv1TVph5tL6WZr1NVNyd5OLAauHd85UmSpHGayjk4C1XVQeCzSZ4PkI7vbpo/DzyjWf4dwMOBAxMpVJIkjUWm8WriSa4DzqYzEvNF4DLgg8CbgLXA8cD1VXVFko3A7wEn0plw/MtV9YFJ1C1JksZjKgOOJEnSIK3YRSVJktRt6iYZr169uk4//fRJlyFJklaA22677UtVtWbh8qkLOKeffjq7du2adBmSJGkFSPKPvZa7i0qSJLWOAUeSJLWOAUeSJLWOAUeSJLWOAUeSJLWOAUeSJLWOAUeSJLWOAUeSJLXO1J3obxq9/dbPc+PufX3bt25ax4WbN4yxIkmS2s0RnDG4cfc+5vcf7Nk2v//gwPAjSZKO3MhGcJJcAzwHuLeqHjeg3/cCNwPnV9W7RlXPpG1cu4obXvbkw5afd9XNE6hGkqR2G+UIzrXAlkEdkhwLvAb4wAjrkCRJM2ZkAaeqPgzcv0i3VwDvBu4dVR2SJGn2TGyScZJ1wPOApwPfu0jfbcA2gA0bVt5k3MUmEc/vP8jGtavGWJEkSbNtkpOM/y/wK1X1tcU6VtXVVTVXVXNr1qwZfWVHaNAkYujMv9m6ad0YK5IkabZN8jDxOeD6JACrgXOSPFRV751gTUvWbxKxJEkav4kFnKo649D9JNcCfzqt4UaSJK0sozxM/DrgbGB1kr3AZcDxAFV15aheV5IkaWQBp6ouOIK+Lx5VHZIkafZ4JmNJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6BhxJktQ6x026AMH8/oOcd9XNPdu2blrHhZs3jLkiSZKmmwFnwrZuWte3bX7/QQADjiRJR8iAM2EXbt7QN8D0G9WRJEmDOQdHkiS1jgFHkiS1zsgCTpJrktyb5I4+7T+W5PYkn0zykSTfPapaJEnSbBnlCM61wJYB7Z8FnlZV3wW8Crh6hLVIkqQZMrJJxlX14SSnD2j/SNfDW4D1o6pFkiTNlpUyB+elwPv7NSbZlmRXkl0HDhwYY1mSJGkaTTzgJHk6nYDzK/36VNXVVTVXVXNr1qwZX3GSJGkqTfQ8OEkeD7wZeHZV3TfJWiRJUntMbAQnyQbgj4Efr6pPT6oOSZLUPiMbwUlyHXA2sDrJXuAy4HiAqroSuBR4FPC7SQAeqqq5UdUjSZJmxyiPorpgkfafBH5yVK8vSZJm18QnGUuSJC03A44kSWodA44kSWodA44kSWodA44kSWodA44kSWodA44kSWodA44kSWodA44kSWodA44kSWod
A44kSWodA44kSWodA44kSWodA44kSWodA44kSWodA44kSWodA44kSWodA44kSWqd4yZdgAab33+Q8666uW/71k3ruHDzhjFWJEnSymfAWcG2blo3sH1+/0EAA44kSQuMLOAkuQZ4DnBvVT2uR3uA1wHnAP8BvLiqPjaqeqbRhZs3DAwvg0Z2JEmaZaOcg3MtsGVA+7OBM5vbNuBNI6xFkiTNkJEFnKr6MHD/gC5bgbdUxy3AyUnWjqoeSZI0OyZ5FNU64J6ux3ubZYdJsi3JriS7Dhw4MJbiJEnS9JqKw8Sr6uqqmququTVr1ky6HEmStMJNMuDsA07rery+WSZJknRUJhlwtgMvSseTgAeqav8E65EkSS0xysPErwPOBlYn2QtcBhwPUFVXAjvoHCK+h85h4i8ZVS2SJGm2jCzgVNUFi7QX8DOjen1JkjS7pmKSsSRJ0pEw4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNYx4EiSpNY5ooCT5BFJjh1VMZIkScvhuEGNSY4Bzgd+DPhe4EHgYUm+BLwPuKqq9oy8SvU1v/8g5111c8+2rZvWceHmDWOuSJKkyRsYcIAPAX8B/CpwR1V9DSDJqcDTgdckeU9V/dFoy1QvWzet69s2v/8ggAFHkjSTFgs4z6yqryxcWFX3A+8G3p3k+H4rJ9kCvA44FnhzVb16QfsG4A+Bk5s+l1TVjiPagjF5+62f58bd+3q2ze8/yMa1q8ZcUSe89Asw/UZ1JEmaBQPn4FTVV5JcCJDk/H59ei1v5uq8EXg2sBG4IMnGBd1+HXhHVT2Bzq6w3z2y8sfnxt37vj4qstDGtasGjqZIkqTxWmwEB2BdkhcA64/wuc8C9lTV3QBJrge2AvNdfQo4NPRxEvBPR/gaY7Vx7SpueNmTJ12GJElaxMARnCSXAacCbwNOTXLpETz3OuCersd7m2XdLgdemGQvsAN4RZ86tiXZlWTXgQMHjqAESZI0ixbbRfVK4D7gx4H7quqKZX79C4Brq2o9cA7w1ubIrYV1XF1Vc1U1t2bNmmUuQZIktc0w58HZX1XXc+S7j/YBp3U9Xt8s6/ZS4B0AVXUz8HBg9RG+jiRJ0jdYbBfViVX1NoCquq5fnz6r7wTOTHJGkhPoTCLevqDP54FnNM/zHXQCjvugJEnSUVlsBOfGJK9N8gNJHnFoYZLHJHlpkpuALb1WrKqHgIuBm4BP0Tla6s4kVyQ5t+n2i8BPJfkEcB3w4qqqo90oSZI02wYeRVVVz0hyDvAy4KlJTgEeAu6iMyn4oqr6woD1dzT9updd2nV/Hnjq0suXJEk63KKHifcKKZIkSSvZUBfbTPKXwyyTJElaCRa72ObDgf8CrG52T6VpWsXh57SRJElaERbbRfUy4OeBbwFu4/8HnIPAG0ZXliRJ0tItNsn4dcDrkryiql4/ppokSZKOyjDXoqKqXp/kKcDp3etU1VtGVJckSdKSDRVwkrwVeCywG/hqs7gAA44kSVpxhgo4wByw0ZPwSZKkaTDUYeLAHcA3j7IQSZKk5TLsCM5qYD7JR4EHDy2sqnP7ryJJkjQZwwacy0dZhCRJ0nIa9iiqvx51IZIkSctl2KOo/pXOUVMAJwDHA/9eVatGVZgkSdJSDTuC88hD95ME2Ao8aVRFaXnM7z/IeVfd3Ld966Z1XLh5wxgrkiRpPIY9iurrquO9wA8vfzlaLls3rWPj2v4DbPP7D3Lj7n1jrEiSpPEZdhfVj3Q9PIbOeXH+cyQVaVlcuHnDwNGZQSM7kiRNu2GPonpu1/2HgM/R2U0lSZK04gw7B+cloy5EkiRpuQw1ByfJ+iTvSXJvc3t3kvWjLk6SJGkpht1F9QfA24HnN49f2Cz7oVEUpfEYdJSVR1hJkqbZsEdRramqP6iqh5rbtcCaxVZKsiXJ
XUn2JLmkT58XJJlPcmeStx9B7ToKg46y8ggrSdK0G3YE574kLwSuax5fANw3aIUkxwJvpDPKsxfYmWR7Vc139TkT+FXgqVX1z0m+6Ug3QEsz6Cgrj7CSJE27YUdwfgJ4AfAFYD/wo8CLF1nnLGBPVd1dVV8GrufwI69+CnhjVf0zQFXdO2Q9kiRJfQ0bcK4ALqqqNVX1TXQCzysXWWcdcE/X473Nsm7fBnxbkr9LckuSLb2eKMm2JLuS7Dpw4MCQJUuSpFk1bMB5/KFRFoCquh94wjK8/nHAmcDZdHZ7/V6Skxd2qqqrq2ququbWrFl06o8kSZpxwwacY5KccuhBklNZfP7OPuC0rsfrm2Xd9gLbq+orVfVZ4NN0Ao8kSdKSDRtwXgvcnORVSV4FfAT4zUXW2QmcmeSMJCcA5wPbF/R5L53RG5KsprPL6u4ha5IkSepp2DMZvyXJLuAHm0U/0n00VJ91HkpyMXATcCxwTVXdmeQKYFdVbW/anpVkHvgq8L+qauDRWZIkSYsZ9jBxmkAzMNT0WGcHsGPBsku77hfwC81NkiRpWQy7i0qSJGlqGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrHDfpArQyze8/yHlX3dyzbeumdVy4ecOYK5IkaXgjHcFJsiXJXUn2JLlkQL//kaSSzI2yHg1n66Z1bFy7qmfb/P6D3Lh735grkiTpyIxsBCfJscAbgR8C9gI7k2yvqvkF/R4J/Bxw66hq0ZG5cPOGviM0/UZ1JElaSUY5gnMWsKeq7q6qLwPXA1t79HsV8BrgP0dYiyRJmiGjDDjrgHu6Hu9tln1dkicCp1XV+0ZYhyRJmjETO4oqyTHAbwO/OETfbUl2Jdl14MCB0RcnSZKm2igDzj7gtK7H65tlhzwSeBzwV0k+BzwJ2N5ronFVXV1Vc1U1t2bNmhGWLEmS2mCUAWcncGaSM5KcAJwPbD/UWFUPVNXqqjq9qk4HbgHOrapdI6xJkiTNgJEFnKp6CLgYuAn4FPCOqrozyRVJzh3V60qSJI30RH9VtQPYsWDZpX36nj3KWiRJ0uzwUg2SJKl1vFSDjtigyziAl3KQJE2eAUdHZOumdQPb5/cfBDDgSJImyoCjIzLoMg7gpRwkSSuDc3AkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLrGHAkSVLreJi4lt2gEwF6EkBJ0jgYcLSsBp0I0JMASpLGxYDT5ZV/cifz/3SwZ9v8/oNsXLtqzBVNn0EnAvQkgJKkcXEOzpA2rl216GUKJEnSyuAITpfLnvudky5BkiQtA0dwJElS6ziCo7HyCCtJ0jgYcDQ2HmElSRoXA47GxiOsJEnj4hwcSZLUOiMNOEm2JLkryZ4kl/Ro/4Uk80luT/KXSb51lPVIkqTZMLKAk+RY4I3As4GNwAVJNi7o9nFgrqoeD7wL+M1R1SNJkmbHKEdwzgL2VNXdVfVl4Hpga3eHqvpQVf1H8/AWYP0I65EkSTNilJOM1wH3dD3eC2we0P+lwPt7NSTZBmwD2LDBo2zaatAh5OBh5JKk4a2Io6iSvBCYA57Wq72qrgauBpibm6sxlqYxWewyGB5GLkk6EqMMOPuA07oer2+WfYMkzwR+DXhaVT04wnq0gg06hBw8jFySdGRGOQdnJ3BmkjOSnACcD2zv7pDkCcBVwLlVde8Ia5EkSTNkZCM4VfVQkouBm4BjgWuq6s4kVwC7qmo78FvAicA7kwB8vqrOHVVNmm5e5kGSNKyRzsGpqh3AjgXLLu26/8xRvr7aw8s8SJKOxIqYZCwtZqmXeXj7rZ/nxt2HTf36Bo7+SFL7GHDUCv12X9362fsB2HzGqX3XA0d/JKltDDiaeoN2X20+49SBIzQenSVJ7WTA0dRb
7BBzSdLs8WrikiSpdQw4kiSpddxFpZnn+XUkqX0MOJppnl9HktopVdN17cq5ubnatWvXpMvQDDjvqpuZ33+QjWtX9Wx3dEeSJi/JbVU1t3C5IzhSH47uSNL0cgRHWoLFRnfAER5JGgdHcKRlNGh0BzpnUL71s/f3vUyE4UeSRsuAIy3BYicXHHQNLHdvSdLouYtKGjMnL0vS8nEXlbRCDNq9tdiurUPr9wpAi1053eAkaZYYcKQxG7R7a7GQMigADbpy+ih3iy1W8yBLDV2GOUmLcReVNEWW+sM+zFFfSzUoWI1ivcXWPbSdN7zsyUf8vJKmT79dVAYcaQYczSjLMJYyYnK0NY0qzDn6I00XA46kmXA0weloRpXAcKThDPMZ9bM0PAOOJC1ikuFoqfwhnJylfl4W+6wczWdpFj8PEwk4SbYArwOOBd5cVa9e0P4w4C3A9wD3AedV1ecGPacBR9JKNOrdgL2MMlS16YdyVO/NqILIqILT0VjJn4exB5wkxwKfBn4I2AvsBC6oqvmuPj8NPL6qXp7kfOB5VXXeoOc14EhSx0r84V6JZuWHf6V+HjZ+yyoue+53LmdJ32AS58E5C9hTVXc3BVwPbAXmu/psBS5v7r8LeEOS1LTtN5OkCVjsjNpLNYnRqFHafMapKyqIjIqfh280yoCzDrin6/FeYHO/PlX1UJIHgEcBX+rulGQbsA1gw4Z2f0AladJG9UOp6TStn4djJl3AMKrq6qqaq6q5NWvWTLocSZK0wo0y4OwDTut6vL5Z1rNPkuOAk+hMNpYkSVqyUQacncCZSc5IcgJwPrB9QZ/twEXN/R8FPuj8G0mSdLRGNgenmVNzMXATncPEr6mqO5NcAeyqqu3A7wNvTbIHuJ9OCJIkSToqI73YZlXtAHYsWHZp1/3/BJ4/yhokSdLsmYpJxpIkSUfCgCNJklrHgCNJklpn6i62meQA8I8jfInVLDjRYIvNyrbOynaC29pWs7Kts7Kd4LYup2+tqsNOkjd1AWfUkuzqdU2LNpqVbZ2V7QS3ta1mZVtnZTvBbR0Hd1FJkqTWMeBIkqTWMeAc7upJFzBGs7Kts7Kd4La21axs66xsJ7itI+ccHEmS1DqO4EiSpNYx4EiSpNaZyYCTZEuSu5LsSXJJj/aHJbmhab81yekTKPOoJTktyYeSzCe5M8nP9ehzdpIHkuxubpf2eq5pkORzST7ZbMeuHu1J8jvN+3p7kidOos6jleS/dr1fu5McTPLzC/pM7fua5Jok9ya5o2vZqUn+PMlnmn9P6bPuRU2fzyS5aHxVL02fbf2tJH/ffEbfk+TkPusO/LyvJH228/Ik+7o+o+f0WXfg3+uVps+23tC1nZ9LsrvPulPznkL/35gV832tqpm60bmy+T8AjwFOAD4BbFzQ56eBK5v75wM3TLruJW7rWuCJzf1HAp/usa1nA3866VqXaXs/B6we0H4O8H4gwJOAWydd8zJs87HAF+ic6KoV7yvwA8ATgTu6lv0mcElz/xLgNT3WOxW4u/n3lOb+KZPeniVs67OA45r7r+m1rU3bwM/7Srr12c7LgV9aZL1F/16vtFuvbV3Q/lrg0ml/T5t6e/7GrJTv6yyO4JwF7Kmqu6vqy8D1wNYFfbYCf9jcfxfwjCQZY43Loqr2V9XHmvv/CnwKWDfZqiZqK/CW6rgFODnJ2kkXdZSeAfxDVY3y7N5jVVUfBu5fsLj7O/mHwH/vseoPA39eVfdX1T8Dfw5sGVWdy6HXtlbVB6rqoebhLcD6sRe2zPq8p8MY5u/1ijJoW5vfkRcA1421qBEZ8BuzIr6vsxhw1gH3dD3ey+E/+l/v0/yheQB41FiqG5FmN9sTgFt7ND85ySeSvD/Jd463smVVwAeS3JZkW4/2Yd77aXM+/f9YtuV9BXh0Ve1v7n8BeHSPPm18f3+CzqhjL4t93qfBxc2uuGv67MZo23v6/cAXq+ozfdqn9j1d8BuzIr6vsxhwZk6SE4F3
Az9fVQcXNH+Mzu6N7wZeD7x3zOUtp++rqicCzwZ+JskPTLqgUUpyAnAu8M4ezW16X79Bdca3W39+iyS/BjwEvK1Pl2n/vL8JeCywCdhPZ9dN213A4NGbqXxPB/3GTPL7OosBZx9wWtfj9c2ynn2SHAecBNw3luqWWZLj6Xzw3lZVf7ywvaoOVtW/Nfd3AMcnWT3mMpdFVe1r/r0XeA+d4e1uw7z30+TZwMeq6osLG9r0vja+eGh3YvPvvT36tOb9TfJi4DnAjzU/EIcZ4vO+olXVF6vqq1X1NeD36F1/m97T44AfAW7o12ca39M+vzEr4vs6iwFnJ3BmkjOa/wM+H9i+oM924NCM7h8FPtjvj8xK1uzv/X3gU1X12336fPOh+UVJzqLzmZi6MJfkEUkeeeg+nYmadyzoth14UTqeBDzQNYw6jfr+32Bb3tcu3d/Ji4Abe/S5CXhWklOa3R3PapZNlSRbgF8Gzq2q/+jTZ5jP+4q2YP7b8+hd/zB/r6fFM4G/r6q9vRqn8T0d8BuzMr6vk56FPYkbnaNpPk1ndv6vNcuuoPMHBeDhdIb99wAfBR4z6ZqXuJ3fR2do8HZgd3M7B3g58PKmz8XAnXSOTrgFeMqk617itj6m2YZPNNtz6H3t3tYAb2ze908Cc5Ou+yi29xF0AstJXcta8b7SCW37ga/Q2S//Ujpz4P4S+AzwF8CpTd854M1d6/5E873dA7xk0tuyxG3dQ2duwqHv7KEjOr8F2NHc7/l5X6m3Ptv51uZ7eDudH8S1C7ezeXzY3+uVfOu1rc3yaw99P7v6Tu172tTc7zdmRXxfvVSDJElqnVncRSVJklrOgCNJklrHgCNJklrHgCNJklrHgCNJklrHgCNJklrHgCNpSZKcnOSnB7R/ZBle48VJ3tDcf3mSFw3oe3aSpxzta0pqBwOOpKU6GTgs4DSnpKeqljVsVNWVVfWWAV3OBgw4kgADjqSlezXw2CS7k+xM8jdJtgPzAEn+rfn37CQfTvK+JHcluTJJ3789SV6S5NNJPgo8tWv55Ul+qbn/s0nmmytRX99cyfjlwP9s6vn+JM9NcmuSjyf5iySP7nqea5L8VZK7k/xs12u8qHnOTyR5a7NsTZJ3N9u4M8lTkbTiHTfpAiRNrUuAx1XVpiRnA+9rHn+2R9+zgI3APwJ/Rueig+9a2Km5PtErge8BHgA+BHy8z2ufUVUPJjm5qv4lyZXAv1XV/26e6xTgSVVVSX6SzvWdfrFZ/9uBpwOPBO5K8ibg24Bfp3NZiy8lObXp+zrg/1TV3ybZQOd6Od8x9H8lSRNhwJG0XD7aJ9wcarsbIMl1dK5hc1jAATYDf1VVB5q+N9AJHgvdDrwtyXuB9/Z5zfXADU1oOgHoru19VfUg8GCSe4FHAz8IvLOqvgRQVfc3fZ8JbGyuXQqwKsmJ1VytXdLK5C4qScvl3we0Lbzo3dFeBO+/0blw6hOBnYfm/SzweuANVfVdwMvoXET3kAe77n+Vwf+zdwydkaBNzW2d4UZa+Qw4kpbqX+ns4hnGWUnOaObenAf8bZ9+twJPS/KoJMcDz1/YoXmO06rqQ8CvACcBJ/ao5yRgX3P/oiFq/CDw/CSPal7n0C6qDwCv6Hr9TUM8l6QJM+BIWpKqug/4uyR3AL+1SPedwBuAT9HZVfSePs+5H7gcuBn4u6b/QscCf5Tkk3Tm5/xOVf0L8CfA8w5NMm6e551JbgO+NMT23An8BvDXST4B/HbT9LPAXDP5eJ7OZGZJK1yqjnakWJL6ayYg/1JVPWfCpUiaIY7gSJKk1nEER9JEJLkVeNiCxT9eVZ+cRD2S2sWAI0mSWsddVJIkqXUMOJIkqXUMOJIkqXUMOJIkqXX+H3WMHzFfShfGAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 576x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(8,4))\n",
    "df.plot1d('trip_distance', limits=[0, 20], f=None, progress='widget')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:10:07.084134Z",
     "start_time": "2020-07-31T01:10:06.431245Z"
    }
   },
   "outputs": [],
   "source": [
    "# Filter negative and too large distances\n",
    "df = df[(df.trip_distance > 0) & (df.trip_distance < 10)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Где чаще всего начинаются поездки?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:10:28.439470Z",
     "start_time": "2020-07-31T01:10:17.156232Z"
    },
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "ef4495eb151842a78c92fa66351c544e",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Heatmap(children=[ToolsToolbar(interact_value=None, supports_normalize=False, template='<template>\\n  <v-toolb…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Interactively plot the pickup locations\n",
    "df.plot_widget(df.pickup_longitude, \n",
    "               df.pickup_latitude, \n",
    "               shape=512, \n",
    "               f='log1p', \n",
    "               colormap='plasma', \n",
    "               limits='minmax')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:12:53.224631Z",
     "start_time": "2020-07-31T01:12:52.515996Z"
    }
   },
   "outputs": [],
   "source": [
    "# Define the NYC boundaries\n",
    "long_min = -74.05\n",
    "long_max = -73.75\n",
    "lat_min = 40.58\n",
    "lat_max = 40.90\n",
    "\n",
    "# Make a selection based on the boundaries\n",
    "df = df[(df.pickup_longitude > long_min)  & (df.pickup_longitude < long_max) & \\\n",
    "        (df.pickup_latitude > lat_min)    & (df.pickup_latitude < lat_max) & \\\n",
    "        (df.dropoff_longitude > long_min) & (df.dropoff_longitude < long_max) & \\\n",
    "        (df.dropoff_latitude > lat_min)   & (df.dropoff_latitude < lat_max)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Обработка дат"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:14:00.807821Z",
     "start_time": "2020-07-31T01:13:01.435727Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                                        </th><th>vendor_id  </th><th>pickup_datetime              </th><th>dropoff_datetime             </th><th>passenger_count  </th><th>payment_type  </th><th>trip_distance     </th><th>pickup_longitude  </th><th>pickup_latitude   </th><th>rate_code  </th><th>store_and_fwd_flag  </th><th>dropoff_longitude  </th><th>dropoff_latitude  </th><th>fare_amount       </th><th>surcharge  </th><th>mta_tax  </th><th>tip_amount        </th><th>tolls_amount  </th><th>total_amount      </th><th>tip_percentage     </th><th>pickup_hour  </th><th>pickup_day_of_week  </th><th>pickup_is_weekend  </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i>            </td><td>VTS        </td><td>2009-01-04 02:52:00.000000000</td><td>2009-01-04 03:02:00.000000000</td><td>1                </td><td>CASH          </td><td>2.630000114440918 </td><td>-73.99195861816406</td><td>40.72156524658203 </td><td>nan        </td><td>nan                 </td><td>-73.99380493164062 </td><td>40.6959228515625  </td><td>8.899999618530273 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>9.399999618530273 </td><td>0.0                </td><td>2            </td><td>6                   </td><td>1                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i>            </td><td>VTS        </td><td>2009-01-04 03:31:00.000000000</td><td>2009-01-04 03:38:00.000000000</td><td>3                </td><td>Credit        </td><td>4.550000190734863 </td><td>-73.98210144042969</td><td>40.736289978027344</td><td>nan        </td><td>nan                 </td><td>-73.95584869384766 </td><td>40.768028259277344</td><td>12.100000381469727</td><td>0.5        </td><td>nan      </td><td>2.0               </td><td>0.0           </td><td>14.600000381469727</td><td>0.13698630034923553</td><td>3            </td><td>6                   </td><td>1                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i>            </td><td>DDS        </td><td>2009-01-01 20:52:58.000000000</td><td>2009-01-01 21:14:00.000000000</td><td>1                </td><td>CREDIT        </td><td>5.0               </td><td>-73.9742660522461 </td><td>40.79095458984375 </td><td>nan        </td><td>nan                 </td><td>-73.9965591430664  </td><td>40.731849670410156</td><td>14.899999618530273</td><td>0.5        </td><td>nan      </td><td>3.049999952316284 </td><td>0.0           </td><td>18.450000762939453</td><td>0.16531164944171906</td><td>20           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i>            </td><td>DDS        </td><td>2009-01-24 16:18:23.000000000</td><td>2009-01-24 16:24:56.000000000</td><td>1                </td><td>CASH          </td><td>0.4000000059604645</td><td>-74.00157928466797</td><td>40.719383239746094</td><td>nan        </td><td>nan                 </td><td>-74.00837707519531 </td><td>40.7203483581543  </td><td>3.700000047683716 </td><td>0.0        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>3.700000047683716 </td><td>0.0                </td><td>16           </td><td>5                   </td><td>1                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i>            </td><td>DDS        </td><td>2009-01-16 22:35:59.000000000</td><td>2009-01-16 22:43:35.000000000</td><td>2                </td><td>CASH          </td><td>1.2000000476837158</td><td>-73.98980712890625</td><td>40.73500442504883 </td><td>nan        </td><td>nan                 </td><td>-73.98502349853516 </td><td>40.72449493408203 </td><td>6.099999904632568 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>6.599999904632568 </td><td>0.0                </td><td>22           </td><td>4                   </td><td>0                  </td></tr>\n",
       "<tr><td>...                                      </td><td>...        </td><td>...                          </td><td>...                          </td><td>...              </td><td>...           </td><td>...               </td><td>...               </td><td>...               </td><td>...        </td><td>...                 </td><td>...                </td><td>...               </td><td>...               </td><td>...        </td><td>...      </td><td>...               </td><td>...           </td><td>...               </td><td>...                </td><td>...          </td><td>...                 </td><td>...                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,545</i></td><td>VTS        </td><td>2015-12-31 23:59:56.000000000</td><td>2016-01-01 00:08:18.000000000</td><td>5                </td><td>1             </td><td>1.2000000476837158</td><td>-73.99381256103516</td><td>40.72087097167969 </td><td>1.0        </td><td>0.0                 </td><td>-73.98621368408203 </td><td>40.722469329833984</td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>1.7599999904632568</td><td>0.0           </td><td>10.5600004196167  </td><td>0.1666666567325592 </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,546</i></td><td>CMT        </td><td>2015-12-31 23:59:58.000000000</td><td>2016-01-01 00:05:19.000000000</td><td>2                </td><td>2             </td><td>2.0               </td><td>-73.96527099609375</td><td>40.76028060913086 </td><td>1.0        </td><td>0.0                 </td><td>-73.93951416015625 </td><td>40.75238800048828 </td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>8.800000190734863 </td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,547</i></td><td>CMT        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:12:55.000000000</td><td>2                </td><td>2             </td><td>3.799999952316284 </td><td>-73.98729705810547</td><td>40.739078521728516</td><td>1.0        </td><td>0.0                 </td><td>-73.9886703491211  </td><td>40.69329833984375 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>14.800000190734863</td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,548</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:10:26.000000000</td><td>1                </td><td>2             </td><td>1.9600000381469727</td><td>-73.99755859375   </td><td>40.72569274902344 </td><td>1.0        </td><td>0.0                 </td><td>-74.01712036132812 </td><td>40.705322265625   </td><td>8.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>9.800000190734863 </td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,549</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:21:30.000000000</td><td>1                </td><td>1             </td><td>1.059999942779541 </td><td>-73.9843978881836 </td><td>40.76725769042969 </td><td>1.0        </td><td>0.0                 </td><td>-73.99098205566406 </td><td>40.76057052612305 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>2.9600000381469727</td><td>0.0           </td><td>17.760000228881836</td><td>0.1666666716337204 </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#              vendor_id    pickup_datetime                dropoff_datetime               passenger_count    payment_type    trip_distance       pickup_longitude    pickup_latitude     rate_code    store_and_fwd_flag    dropoff_longitude    dropoff_latitude    fare_amount         surcharge    mta_tax    tip_amount          tolls_amount    total_amount        tip_percentage       pickup_hour    pickup_day_of_week    pickup_is_weekend\n",
       "0              VTS          2009-01-04 02:52:00.000000000  2009-01-04 03:02:00.000000000  1                  CASH            2.630000114440918   -73.99195861816406  40.72156524658203   nan          nan                   -73.99380493164062   40.6959228515625    8.899999618530273   0.5          nan        0.0                 0.0             9.399999618530273   0.0                  2              6                     1\n",
       "1              VTS          2009-01-04 03:31:00.000000000  2009-01-04 03:38:00.000000000  3                  Credit          4.550000190734863   -73.98210144042969  40.736289978027344  nan          nan                   -73.95584869384766   40.768028259277344  12.100000381469727  0.5          nan        2.0                 0.0             14.600000381469727  0.13698630034923553  3              6                     1\n",
       "2              DDS          2009-01-01 20:52:58.000000000  2009-01-01 21:14:00.000000000  1                  CREDIT          5.0                 -73.9742660522461   40.79095458984375   nan          nan                   -73.9965591430664    40.731849670410156  14.899999618530273  0.5          nan        3.049999952316284   0.0             18.450000762939453  0.16531164944171906  20             3                     0\n",
       "3              DDS          2009-01-24 16:18:23.000000000  2009-01-24 16:24:56.000000000  1                  CASH            0.4000000059604645  -74.00157928466797  40.719383239746094  nan          nan                   -74.00837707519531   40.7203483581543    3.700000047683716   0.0          nan        0.0                 0.0             3.700000047683716   0.0                  16             5                     1\n",
       "4              DDS          2009-01-16 22:35:59.000000000  2009-01-16 22:43:35.000000000  2                  CASH            1.2000000476837158  -73.98980712890625  40.73500442504883   nan          nan                   -73.98502349853516   40.72449493408203   6.099999904632568   0.5          nan        0.0                 0.0             6.599999904632568   0.0                  22             4                     0\n",
       "...            ...          ...                            ...                            ...                ...             ...                 ...                 ...                 ...          ...                   ...                  ...                 ...                 ...          ...        ...                 ...             ...                 ...                  ...            ...                   ...\n",
       "1,083,167,545  VTS          2015-12-31 23:59:56.000000000  2016-01-01 00:08:18.000000000  5                  1               1.2000000476837158  -73.99381256103516  40.72087097167969   1.0          0.0                   -73.98621368408203   40.722469329833984  7.5                 0.5          0.5        1.7599999904632568  0.0             10.5600004196167    0.1666666567325592   23             3                     0\n",
       "1,083,167,546  CMT          2015-12-31 23:59:58.000000000  2016-01-01 00:05:19.000000000  2                  2               2.0                 -73.96527099609375  40.76028060913086   1.0          0.0                   -73.93951416015625   40.75238800048828   7.5                 0.5          0.5        0.0                 0.0             8.800000190734863   0.0                  23             3                     0\n",
       "1,083,167,547  CMT          2015-12-31 23:59:59.000000000  2016-01-01 00:12:55.000000000  2                  2               3.799999952316284   -73.98729705810547  40.739078521728516  1.0          0.0                   -73.9886703491211    40.69329833984375   13.5                0.5          0.5        0.0                 0.0             14.800000190734863  0.0                  23             3                     0\n",
       "1,083,167,548  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:10:26.000000000  1                  2               1.9600000381469727  -73.99755859375     40.72569274902344   1.0          0.0                   -74.01712036132812   40.705322265625     8.5                 0.5          0.5        0.0                 0.0             9.800000190734863   0.0                  23             3                     0\n",
       "1,083,167,549  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:21:30.000000000  1                  1               1.059999942779541   -73.9843978881836   40.76725769042969   1.0          0.0                   -73.99098205566406   40.76057052612305   13.5                0.5          0.5        2.9600000381469727  0.0             17.760000228881836  0.1666666716337204   23             3                     0"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Daily activities\n",
    "df['pickup_hour'] = df.pickup_datetime.dt.hour\n",
    "df['pickup_day_of_week'] = df.pickup_datetime.dt.dayofweek\n",
    "df['pickup_is_weekend'] = (df.pickup_day_of_week >= 5).astype('int')\n",
    "\n",
    "# Treat as a categorical feature\n",
    "df.categorize(column='pickup_hour', inplace=True)\n",
    "\n",
    "weekday_names_list = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']\n",
    "df.categorize(column='pickup_day_of_week', labels=weekday_names_list, inplace=True)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:14:34.551591Z",
     "start_time": "2020-07-31T01:14:23.361462Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA8sAAAFgCAYAAACMteurAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAvtklEQVR4nO3deZhkdX3v8fenexiGfXFwQVYJoogGcSKoUYmRXDBGI6IXjElQE8yNZrlm0WzuWUw0uTFqDOYi1yXiroi4RUFEURnABVCUoAIRhVFQEIZZ+nv/qDNSXdPTXdXTVaeKfr+e5zxz6tSpOp+q6T5d3/otJ1WFJEmSJEm6y1TbASRJkiRJGjcWy5IkSZIk9bBYliRJkiSph8WyJEmSJEk9LJYlSZIkSephsSxJkiRJUg+LZUmSJEnSvJKckeTGJJf3se8/JflSs3wjyS0jiLjk4nWWJUmSJEnzSfIY4DbgLVV1xACP+z3goVX17KGFGxJbliVJkiRJ86qqC4Afdm9LckiSjya5JMlnkjxgjoeeArxjJCGX2Iq2A0iSJEmSJtLpwO9U1TeTHA28AXjcljuTHAgcDHyqpXzbxWJZkiRJkjSQJLsCjwTenWTL5h17djsZeE9VbR5ltqVisSxJkiRJGtQUcEtVHTnPPicDzxtNnKXnmGVJkiRJ0kCq6sfAt5I8DSAdP7vl/mb88l7ARS1F3G4Wy5IkSZKkeSV5B53C97Ak1yd5DvBrwHOSfBm4Anhy10NOBs6qCb78kpeOkiRJkiSphy3LkiRJkiT1mOgJvvZaubL23WlV2zFYsct4vI0r9tql7Qgdq/ZoOwEAP9rQOxlfO25Yd1vbEQComfHoRTK1YrrtCADsuvPKtiMAsPcuO7QdgZ2m17cdAYBs/EnbEQCojRvbjjBe7prhtFWZHo+/tUy3/zsLUJs3tB0BgNo4HjkyPR5/W7JiPD57MDUe7wcZj9/bjbRfLwB85bJL11XVPm3nWApHrtqrbp1Z3N/Lazb+5GNVdfwSRxqK8fgJXqR9d1rFWY84uu0Y3OPoe7QdAYDVJ7b/XgDkAb/cdgQAPnTtAW1HAOBv3jwecxpsvv3OtiMAsPM9x+PLlGOOHI+fj1N+7t5tR+DI3b7RdgQA8v21bUcAYOP3rm87wljJDuNRHK7Yc3XbEQCY2uM+bUcAYPPN17UdAYAN3x2P35cVe4zH35YV9zqo7QgAZKe9244AwMzK8fi9vZFD2o4AwL477/SdtjMslVtnNvJ39zpyUY99+vWfHY8fjD5MdLEsSZIkSRqtANPj0fFoqCyWJUmSJEn9S5iauvtXyxbLkiRJkqSBTC2DqaKXwUuUJEmSJGkwtixLkiRJkvrmmGVJkiRJknoFxyxLkiRJktQtLI8xyxbLkiRJkqT+xWJZkiRJkqRZAkzFbtiSJEmSJM1iy7IkSZIkSd2WyQRfy+D7AEmSJEmSBmPLsiRJkiSpbwGyDJpdLZYlSZIkSQNxzLIkSZIkSd2WyZhli2VJkiRJUt/shi1JkiRJUq9kWbQsj+T7gCS7JflQkvOTXJTkhFEcV5IkSZK09DK1uGXB503OSHJjksu3cX+SvDbJ1Um+kuSopX5tW4yq8fw3gI9W1bHAI4GLRnRcSZIkSdLkOBM4fp77TwAObZbTgH8dVpBRFct3AMckuVd13JJk7ZY7t6wnOTPJG5N8IskHktz92/YlSZIkaYKEzgRfi1kWUlUXAD+cZ5cnA29p6srPA3smuc/SvLLZRlUsvxW4CvhY0w37sHn2/VxVHQfcCTy4984kpyVZm2TtzRs2DimuJEmSJGlO2a5u2Ku31HPNctqAR78vcF3X7eubbUtuJBN8VdVG4JXAK5McB7ysZ5furxgua/69Dthrjuc6HTgd4EF77F5Ln1aSJEmSNJ8sfoKvdVW1ZimzDMtIiuUkBwI3VNUG4EY6xfGqJNN0vgXoLoq7C2C7YUuSJEnSOAlMtXfp
qP8G9u+6vV+zbcmN6tJRDwbemWQ9nQL4ecCv0Jno6wLglhHlkCRJkiRthwCZbq1d82zg+UnOAo4GflRVNwzjQKPqhn0OcE7P5suBv+3Z79Su9T8efjJJkiRJ0kDS32WgFvXUyTuAY+mMbb4eeAmwA0BVvRE4F3gCcDVwO/Cs4SQZXcuyJEmSJEnzqqpTFri/6PRUHjqLZUmSJEnSQLZjgq+JYbEsSZIkSRpALJYlSZIkSZpliGOWx4nFsiRJkiSpb8Fu2JIkSZIkzRbIdNshhs9iWZIkSZI0EFuWJUmSJEnqluVRLC+DYdmSJEmSJA3GlmVJkiRJ0kCcDVuSJEmSpC4JZPru3w3bYlmSJEmSNIAsizHLFsuSJEmSpP7FbtiSJEmSJG3NlmVJkiRJkroskzHLy6DxXJIkSZKkwUx0y/LUirBq9cq2Y7DzA/dpOwIAK+5x77YjAHAL45Hj2h+ubzsCADMbNrUdAYDpnXdsOwIABxxwj7YjAHDsYavbjgDAg/b4ftsRmLrl6rYjALDxphvajgDAnd+9se0IANQdG9uOAMD0Xru0HQGAjTeuazsCADvdv+0EHbd/7ettRwDgpvdd1XYEAH5w1U/ajgDAPg/Zre0IAEztOB7tYTdfcWvbEQC45cbxOJ/e3TjBlyRJkiRJXYITfEmSJEmSNFuAZTBm2WJZkiRJkjQAr7MsSZIkSdJsccyyJEmSJElbWQ7F8jIYli1JkiRJ0mBsWZYkSZIk9c8JviRJkiRJ2pqXjpIkSZIkqUvibNiSJEmSJG3NbtiSJEmSJHXx0lGSJEmSJM3BYlmSJEmSpNmWQ8vyMpjDTJIkSZKkwdiyLEmSJEnqX0Kc4EuSJEmSpB7LoBu2xbIkSZIkqX/LZDbskY5ZTrJbkg8lOT/JRUlO2MZ+p40ylyRJkiRpANNZ3DJBRt2y/BvAR6vq9UkC7LGN/U4DTh9dLEmSJElSP2LL8lDcARyT5F7VcUuS/0jy6SQXJjkgyVOAw5rW52eMOJ8kSZIkaV7pjFlezDJBRt2y/FZgX+BjSe4ATgV+q6pub4rk51bVXyS5qqqOnesJmi7apwHsu/Oq0aSWJEmSJC0rIy2Wq2oj8ErglUmOa9a/n+QhwE7A5X08x+k0XbQffI89aohxJUmSJEm97Ia99JIcmGRlc/NGYE9gz6p6DPB3wJZ33CJYkiRJksaV3bCX3IOBdyZZT6cw/n3gdUk+AXy9a7/zknwQeHNVfWDEGSVJkiRJ2xLI9Kinvxq9UXfDPgc4p2fzo+fY709Hk0iSJEmSNJjJayVejFG3LEuSJEmSJtxyGLNssSxJkiRJ6l9YFi3Ld/+O5pIkSZIkDciWZUmSJEnSQOyGLUmSJElSlyTOhi1JkiRJ0lZsWZYkSZIkqUvshi1JkiRJ0tYsliVJkiRJ6pKQqbv/mOW7/yuUJEmSJC2tqSxuWUCS45NcleTqJC+a4/4DkpyX5LIkX0nyhKG8PiyWJUmSJEljIMk08HrgBOBw4JQkh/fs9pfAu6rqocDJwBuGlcdu2JIkSZKkwQxnzPLDgaur6hqAJGcBTwau7NqngN2b9T2A7w4jCFgsS5IkSZIGEYY1Zvm+wHVdt68Hju7Z56XAx5P8HrAL8PhhBAG7YUuSJEmSBrLI8cqd1ujVSdZ2LacNePBTgDOraj/gCcBbkwylrrVlWZIkSZLUt2xfy/K6qlqzjfv+G9i/6/Z+zbZuzwGOB6iqi5KsAlYDNy420LbYsixJkiRJGsxwZsO+GDg0ycFJVtKZwOvsnn2uBX4RIMkDgVXATUv86oAJb1leue9eHPyyp7Ydg+/e+9fajgDAmZff0XYEAD5+9rfbjgDA969d13aEsfKi33xE2xEAOHHXD7YdAYAffODv2o4AwPrddmw7AjnsgLYjdAxnopCB7bDX7gvvNAJT++3UdgQANq67ue0IAKzc775tRwAg9zqi7QgATH/7v9qOAMDtN6xvOwIA37l2Q9sRGre2HQCA
VbtOtx0BgPW3bW47AgCrdrZ9cBgyhL/bVbUpyfOBjwHTwBlVdUWSlwNrq+ps4I+ANyX533Qm+zq1qmrJwzDhxbIkSZIk6e6jqs4Fzu3Z9uKu9SuBR40ii8WyJEmSJKl/CQxnNuyxYrEsSZIkSRrIMLphjxuLZUmSJElS/4Ity5IkSZIkzRZbliVJkiRJ2ooty5IkSZIkdQkkd/9i+e7/CiVJkiRJGpAty5IkSZKkAQQcsyxJkiRJUg/HLEuSJEmSdJfE6yxLkiRJktQjsAwm+LJYliRJkiQNJHbDliRJkiSpS1gWE3z1/XVAkhPm2PY7SxtHkiRJkqT2DdJ2/ldJHrflRpI/BZ689JEkSZIkSeOrGbO8mGWCDNIN+0nAOUn+BDgeeAAWy5IkSZK07DhmuUtVrUvyJOA/gUuAk6qqhpZMkiRJkjR+gtdZBkhyK9BdFK8E7geclKSqavfFHjzJQcDFwBXNppdX1aea+04Frqqqixb7/JIkSZKkpRaSu/8EXwsWy1W125AzfLqqTurekGSqqs4c8nElSZIkSYthy/Jd0vnq4NeAg6vqFUn2B+5TVV9cqjBJrgS+APwoyS3A2qo6p2ef04DTAA7Y9x5LdWhJkiRJUj/CxE3WtRiDvMI3AI8AntHcvg14/RJkeGyS85OcD+wHvKCq/nBbO1fV6VW1pqrWrN571yU4vCRJkiSpfyFTU4taJskgs2EfXVVHJbkMoKpuTrJyCTL8tBt2kkur6uYleE5JkiRJkhZtkGJ5Y5Jpmsm+kuwDzCxxnqV+PkmSJEnSUlsG3bAHKZZfC7wfuFeSvwZOAv5yKKkkSZIkSeMpcYKvblX19iSXAL9IZ0j3r1bV17bn4FX1bTpF95bba7rWX7o9zy1JkiRJWnoBLx01h9XA7VX15iT7JDm4qr41jGCSJEmSpDFly/JdkrwEWAMcBrwZ2AF4G/Co4USTJEmSJI2fOGa5x1OAhwKXAlTVd5PsNpRUkiRJkqTx5HWWt7Khqoq7ZsPeZTiRJEmSJElq1yAty+9K8m/Ankl+G3g28KbhxJIkSZIkjacQxyzfpapeneQ44Md0xi2/uKo+MbRkkiRJkqTxtAy6YQ8ywddzgAuq6k+GmEeSJEmSNO4slmc5APi3JAcBlwAXAJ+pqi8NIZckSZIkaRzF2bBnqaqXACTZCfht4E+A/wNMDyWZJEmSJGk8WSzfJclf0rmm8q7AZcAfA58ZUi5JkiRJ0lhygq9eJwKbgA8DnwYuqqo7h5JKkiRJkqQW9f11QFUdBTwe+CJwHPDVJBcOK5gkSZIkaUwli1smyCDdsI8AHg08FlgDXIfdsCVJkiRpeQmOWe7xd3RmwH4tcHFVbRxOJEmSJEnS+HI27Fmq6onz3Z/kvVX11O2PJEmSJEkaaxbLA7nfEj5XX2rlnqzf/0mjPuxWPvSV8Zjn7OOXXNt2BABu+MZ3244AwNSPf9J2BABq153ajgDA7qvG4ypvm9aNx8/HhhvG5Odj00zbEdjxjvVtRwAgO65sOwIAm8fk/WBqPMZ11abNbUcAoDaMx9/aAaZ7GapxmYW22j+FATAmbwe33zYevy8b1o/Hf8yuey9lqbF4+xxzj7YjdHyh7QBLzGJ5ILWEzyVJkiRJGkt2w5YkSZIkabZlMsHXUr7C8egvJkmSJEnSdlqwWE7yyebfVy2w6wuXJJEkSZIkaYyFTim5mKU9SXZJ0vdEPv10w75PkkcCT0pyFj0tyFV1afPvxwdKKkmSJEmaTBPQDTvJFHAy8GvAzwF3AjsmWQd8GPi3qrp6W4/vp1h+MfBXwH7AP/bcV8DjFpFbkiRJkjSpJqBYBs4D/hP4M+Dyqs4c/kn2Bn4BeFWS91fV2+Z68ILFclW9B3hPkr+qqlcsXW5JkiRJ0uSZmNmwH19VG3s3VtUPgfcC702yw7Ye3PcrrKpXJHlSklc3yxMXl1eSJEmSNLG2zIa9mGWEqmpjkmcAJDl5W/ts6/F9Xzoqyd8C
Dwfe3mz6gySPrKo/HyCvJEmSJGmibZngayLcN8nT6QwrHsggr/CXgeOq6oyqOgM4HrB1WZIkSZI0dpK8BNibToPv3klePMjjB/06YM+u9T0GfKwkSZIk6e5gSN2wkxyf5KokVyd50Tb2eXqSK5NckeQ/tvVcVfUy4AfArwM/qKqXD/IS++6GDfwtcFmS8+i0uz8GmDO8JEmSJOlubAjjj5trIL8eOA64Hrg4ydlVdWXXPofSmd36UVV1c5J7LvC0N1TVWUlOGTRP38VyVb0jyfl0rk8F8MKq+l5X6AdV1RWDBpAkSZIkTZJQZBhP/HDg6qq6BiDJWcCTgSu79vlt4PVVdTNAVd24zZTJrlX19ma/d8yzz21z3TfQ1wFVdUNVnd0s3+u5+62DPJckSZIkaUItvhv26iRru5bTup71vsB1Xbevb7Z1uz9w/ySfTfL5JMfPk/KDSV6T5DFJdvlp9OR+SZ6T5GN05uKa0yDdsBcylK8WJEmSJEnjZtHdsNdV1ZrtOPAK4FDgWDozXF+Q5MFVdUvvjlX1i0meADwXeFSSvYBNwFXAucBvztEIPOtAS6WW8LkkSZIkSWMpw7pm8n8D+3fd3q/Z1u164AvN9ZG/leQbdIrni+d6wqo6l05hPLCJuTiWJEmSJOlu7WLg0CQHJ1kJnAyc3bPPB+i0KpNkNZ1u2dfM96RJPtnPtl5L2bK8YQmfS5IkSZI0pmoI7a5VtSnJ84GPAdPAGVV1RZKXA2ur6uzmvl9KciWwGfiTqvrBXM+XZBWwM51x0ntx19Dh3dl6LPRW+i6Wk7wP+L/AR6pqZo4Xdsw8j90J+Ehz82HAJc36NVX17H4zSJIkSZJaFiDDmbJqrm7TVfXirvUCXtAsC3ku8IfAvnRq0C2hfwy8bqEHD9Ky/AbgWcBrk7wbeHNVXdXPA6vqDu5qKl9bVccmORZ44gDHlyRJkiS1LkzCiN6q+mfgn5P8XlX9y6CPH+Q6y/8J/GeSPYBTmvXrgDcBb2sGWA/qiCTvB+4H/FpVXd4U02vgp4X19syUJkmSJElaQgXUcCb4Goqq+pckjwQOoqsGrqq3zPe4gcYsJ7kH8Ezg14HLgLcDPw/8Jk3L8YB2qKrjk5wAPJs+mtKb63CdBrD/Afst4pCSJEmSpMWbjJblLZK8FTgE+BKdcc7QqfmXplhuWoAPA94K/EpV3dDc9c4kawcN3PhS8+91wF5zHbZ3Q1WdDpwOcNTDjvRyVZIkSZI0YrV1qTbO1gCHN+Od+zZIy/Jrq+q8ue7Yjq7S3WG3vNubk+zWrN9vkc8rSZIkSRqWCeqGDVwO3Bu4YaEduw0yZvm8JEcAhwOrurbP23S9CK8HPgN8EfjuEj+3JEmSJGl5WQ1cmeSLwJ1bNlbVk+Z70CDdsF9CZ1zy4XSm8j4BuJAF+nn32tIKXVXnA+c365cDpzbrbxn0OSVJkiRJozNh3bBfupgHDdIN+yTgZ4HLqupZSe4FvG0xB5UkSZIkTarJmuCrqj69mMcNUizfUVUzSTYl2R24Edh/MQeVJEmSJE2umqBiOcmt3DVf1kpgB+AnVbX7fI8bpFhem2RPOtdVvgS4Dbho8KiSJEmSpMk2Od2wq2rLBNIkCfBk4JiFHjfIBF+/26y+MclHgd2r6iuDBpUkSZIkTbJMVMtyt+byUR9o5uR60Xz7LlgsJzlqvvuq6tLBI0qSJEmSJlEBlclpWU5yYtfNKTrXXV6/0OP6aVl+TfPvquZJv0ynzf0hwFrgEQMllSRJkiRpdH6la30T8G06XbHntWCxXFW/AJDkfcBRVfXV5vYRLHIKbkmSJEnS5JqkS0dV1bMW87hBOpoftqVQbg54OfDAxRxUkiRJkjSptlw6ajHL6CXZL8n7k9zYLO9Nst9Cjxsk7VeS/HuSY5vlTYATfEmSJEnSMlNkUUtL3gycDezbLB9qts1rkGL5WcAVwB80y5XNNkmSJEnSMlKVRS0t2aeq3lxVm5rlTGCfhR40
yKWj1gP/1CxbSfLeqnpqv88nSZIkSZpME3bpqB8keSbwjub2KcAPFnrQUr7C+y3hc0mSJEmSxtBiu2C32A372cDTge8BNwAnAacu9KC+W5b7UEv4XJIkSZIkLYWXA79ZVTcDJNkbeDWdInqblrJYliRJkiQtA5N06SjgIVsKZYCq+mGShy70oKUslifq3ZIkSZIkLc6EFctTSfbqaVlesBYeqFhOshJ4AJ0u11dV1Yauu184yHNJkiRJkibThBXLrwEuSvLu5vbTgL9e6EF9F8tJfhl4I/BfdFqRD07y3Kr6CEBVfXzgyNtp48w031u/x6gPu5Vrbryx7QgA3PLD29qOAEBuX992BABWrN+w8E4jsGHnVW1HGCu1eXPbEQCY2TAeOWrjTNsRYMV02wkAyNR4zKqZjMkf/zF5P6Z2cMTWLLWp7QQA1KbxyDEzMx5T1uy4w3j83u6+93j8vqzcZTzO67seuHPbEQDY6/gHtR2hY8HSbHIUrV4GamBV9ZYka4HHNZtOrKorF3rcIL/RrwF+oaquBkhyCPBh4CODhpUkSZIkTa4Ja1mmKY4XLJC7DVIs37qlUG5cA9w6yMEkSZIkSZNv0orlxRikWF6b5FzgXXTGLD8NuDjJiQBV9b4h5JMkSZIkjZlJ6oa9WIMUy6uA7wOPbW7fBOwE/Aqd4tliWZIkSZJ0t9B3sVxVzxpmEEmSJEnSZLAbdpckb6bTgjxLVT17SRNJkiRJksZWAWNwPY+hG6Qb9jld66uApwDfXdo4kiRJkqSxVpN16ajFGqQb9nu7byd5B3DhkieSJEmSJI01u2HP71DgnksVRJIkSZI0/gqY2WqA7t3PIGOWb2X2mOXvAS9c8kSSJEmSpLFmN+zZ9qmq9d0bkuy9xHkkSZIkSWrd1AD7vjfJT4vrJPcGPrH0kSRJkiRJ42xmkcskGaRY/gDw7iTTSQ4CPg782TBCSZIkSZLGVzUzYg+6TJJBZsN+U5KVdIrmg4DnVtXnhpRLkiRJkjSGvM5yI8kLum8CBwBfAo5JckxV/eOQskmSJEmSxtCktRIvRj8ty7v13H7fNrZLkiRJku7uyktHAVBVLxtFEEmSJEnS+CuglkGx3PcEX0k+kWTPrtt7JfnYPPv/e5Kjm/U/TPKRZj1JrlzgWEckObPfbJIkSZIkLaVBZsPep6pu2XKjqm4G7jnP/p8Hjm7WjwI2Nev3B74+wHElSZIkSWMjzCxymSR9z4YNbE5yQFVdC5DkQDot8NvyeeDPgdcCOwNfTXJ/4Bjg80k+AOwO3AD8Bp3Jw/4D2Bv4zoCvQ5IkSZI0Io5Znu0vgAuTfJpOYfto4LR59r8SeGCSewI3Al+g09J8NHAY8NdV9akkLwSe0jzm6qr68yS/Q6eo3kqS07Ycd9/9DxggviRJkiRpexXLo1juuxt2VX2UTnfqdwJnAQ+rqm2OWa6qGWAd8ETgi81ydPMcU8DLkpwPnAjcG/gZ4JLm4RfP87ynV9Waqlqz9z1W9xtfkiRJkrQUqjPB12KWSdLPdZYfUFVfT3JUs+m7zb8HNN2yL53n4V8Afh84papuSHIInS8iLgPeX1WfaY6xA/Bk4KHAe4E1i3s5kiRJkqRhWw4ty/10w34BnW7Pr2H2GOU0tx83z2M/D/wed03odRvwX8BfA29KsuWyVH8KfAA4OckngW/0mV+SJEmSNELLpRt2P9dZ3jIu+QnA7wI/T+f9+Qzwrws89hxgj67bT+u6+8Q5HnLSQnkkSZIkSe2yWJ7t/wE/pjO7NcAzgLcAT1/qUJIkSZIktWmQYvmIqjq86/Z5Sa5c6kCSJEmSpPG1XLph9z0bNnBpkp9ezinJ0cDapY8kSZIkSRpb1SmWF7MsJMnxSa5KcnWSF82z31OTVJKhTQ49SMvyw4DPJbm2uX0AcFWSrwJVVQ9Z8nSSJEmSpLEzjJblJNPA64HjgOuBi5OcXVVX9uy3G/AHdK6+NDSDFMvHDy2FJEmSJGkiDLEb
9sOBq6vqGoAkZ9G5xHDv8N9XAK8C/mQoKRp9F8tV9Z1hBpEkSZIkTYJiphZdLa9O0j2c9/SqOr1Zvy9wXdd91wNHdz84yVHA/lX14STjUSxLkiRJkrSdLcvrqmpR44yTTAH/CJy66KMPYJAJviRJkiRJGpb/Bvbvur1fs22L3YAjgPOTfBs4Bjh7WJN82bIsSZIkSepfwebhjFm+GDg0ycF0iuSTgWf89LBVPwJWb7md5Hzgj6tqKFdpsliWJEmSJPVtWBN8VdWmJM8HPgZMA2dU1RVJXg6sraqzl/6o22axLEmSJEkayHZM8DWvqjoXOLdn24u3se+xQwnRsFiWJEmSJA1kZqbtBMNnsSxJkiRJ6lsVbB5Sy/I4sViWJEmSJA1kGGOWx42XjpIkSZIkqYcty5IkSZKkvnVmw777Ny1PdLF8+0b40g2b247BN6+/ue0IANyx7sdtRwBg1Y9/0nYEAFZsav9nA2DDmMx+MC7jSqZW7dR2BACmdxqP09/UGOSY2nFl2xGA8fnZmNq4se0IAEyvWtV2BABmbru97QgA1OZNbUcAIDMb2o4AwMzG8Xg/ZjaNx9+W3XebbjsCAKv2aP+cDrDDbuORY+dD92g7AgCrDjq07Qh3S5vH4yPuUI3Hb5IkSZIkaSJU2bIsSZIkSdJWxqTz5FBZLEuSJEmS+lbU2AzxGyaLZUmSJEnSQOyGLUmSJElSl6rlMcGX11mWJEmSJKmHLcuSJEmSpIHYDVuSJEmSpC4FbJ6xWJYkSZIk6S4Fy6BWtliWJEmSJPXPlmVJkiRJknp4nWVJkiRJknoVzCyDlmUvHSVJkiRJUg9bliVJkiRJfXPMsiRJkiRJc1gGtbLFsiRJkiSpf1W2LEuSJEmS1MPZsCVJkiRJmqVYHrNhj6xYTrIT8JHm5sOAS5r1E6vqh6PKIUmSJEnaDnbDXlpVdQdwLECStVV17KiOLUmSJEnSIFq7znKSlyZ5YrP+/CSnNut/nuTTSS5I8uA5HndakrVJ1v745ptGnFqSJEmSlrcCNlctapkkrRXLc0lyBHBYVT0WOBl4Ze8+VXV6Va2pqjW777XPyDNKkiRJ0nI3M1OLWiZJmxN8db9Taf49HHhkkvOb25tHmkiSJEmSNK+qcszykN0M7Nes/yxwIfB14NNV9VsASXZoKZskSZIkaRsslofrPcDZSZ4A3ApQVV9J8s0knwZmgE8Af9NiRkmSJElSlwJmJmz88WK0UixX1Zpmdc0c970KeNVoE0mSJEmS+uKloyRJkiRJmq1g4ibrWoyxmg1bkiRJkqRxYMuyJEmSJGkAzoYtSZIkSdIs5ZhlSZIkSZK2thzGLFssS5IkSZL6VsBmLx0lSZIkSVKXKmZmZtpOMXQWy5IkSZKkvhXLY8yyl46SJEmSJKmHLcuSJEmSpP6VE3xJkiRJkjTLcumGbbEsSZIkSRqILcuSJEmSJHWpKotlSZIkSZJ6bfbSUZIkSZIk3aWWyQRfXjpKkiRJkqQeE92yvLmKW+/c3HaMsflWJSum244AwOYxyaHZbrptY9sRAFix/4PbjgDAHo/9XtsRAJjeZee2I7DygAe2HQGATI3Hn6Tp3X7QdgQAssOObUcAYGrnXdqO0DE9Jn9bqv3PHQBTO4zH78ueh7R/DgPYs+0Ajd2O2qftCADsuN9ebUcAYKf7H9J2BADuuOfj2o5wtzQuNdAwjceZVpIkSZI0EarKS0dJkiRJktRrObQsO2ZZkiRJktS3AmZqZlHLQpIcn+SqJFcnedEc978gyZVJvpLkk0kOHMZrBItlSZIkSdIgmtmwF7PMJ8k08HrgBOBw4JQkh/fsdhmwpqoeArwH+PshvELAYlmSJEmSNIBicYVyH123Hw5cXVXXVNUG4CzgybOOXXVeVd3e3Pw8sN+Sv8CGY5YlSZIkSQOZmVm4S/U2rE6ytuv26VV1erN+X+C6rvuuB46e57meA3xksUEWYrEsSZIkSRqVdVW1ZnufJMkz
gTXAY7c/0twsliVJkiRJ/auhzYb938D+Xbf3a7bNkuTxwF8Aj62qO4cRBCyWJUmSJEkDKKCGUyxfDBya5GA6RfLJwDO6d0jyUODfgOOr6sZhhNjCYlmSJEmS1L+q7RmzPM/T1qYkzwc+BkwDZ1TVFUleDqytqrOBfwB2Bd6dBODaqnrSkofBYlmSJEmSNKAhdcOmqs4Fzu3Z9uKu9ccP5cBzsFiWJEmSJPWtCmrzcIrlcWKxLEmSJEkawHC6YY+bqbYDSJIkSZI0bmxZliRJkiQNZEizYY8Vi2VJkiRJUv/KYlmSJEmSpFk611l2zPJ2S3JQkkryC83tlUlubq6fJUmSJEmaJFXUzOKWSTKqCb7WAic2648Hvjmi40qSJEmSlljNzCxqmSSjKpa/AxyQJMBTgPcBJHlBkouSXJjkqGbbpUlel+QLSV44onySJEmSpD7Zsry0LgIeA+wDfA/YE/hV4FHAM4FXNfvtCfwD8Ejg13ufJMlpSdYmWXvbzeuGHlqSJEmStPyMslh+L/BPwPld275cVTNV9W06RTLAzVX1naraDKzvfZKqOr2q1lTVml33Wj3kyJIkSZKkWcqW5SVVVd8ELgTe07X5yCRTSQ4Cbtmy66gySZIkSZIGVTAzs7hlgoz00lFV9fsAnaHL3AJ8EPgcMAP83iizSJIkSZIGV15neWk0XaxP6tl2ZtfNV/fct2audUmSJEnSmJiwVuLFGGnLsiRJkiRp0k3e+OPFsFiWJEmSJPWvsGVZkiRJkqStLIOW5VFeOkqSJEmSpIlgy7IkSZIkqX9VxG7YkiRJkiT1WAbdsC2WJUmSJEl9C9iyLEmSJEnSLIUty5IkSZIkzeaYZUmSJEmSZqvl0Q3bS0dJkiRJktTDlmVJkiRJ0kCmHLMsSZIkSdJd4phlSZIkSZJ6lC3LkiRJkiRtxZZlSZIkSZK6pIqpZVAsp2pym8+T3AR8ZzufZjWwbgnibC9zzGaO2cwxmzlmM8ds5pjNHLONQ45xyADm6GWO2cwx21LkOLCq9lmKMG1L8lE678lirKuq45cyz7BMdLG8FJKsrao15jCHOcxhDnOYwxzLJYM5zGGOycyh0fI6y5IkSZIk9bBYliRJkiSph8UynN52gIY5ZjPHbOaYzRyzmWM2c8xmjtnGIcc4ZABz9DLHbOaYbVxyaISW/ZhlSZIkSZJ62bIsSZIkSVIPi2VJkiRJknos62I5yauSfCbJW5Ps0GKOPZJ8McltSY5oKcPDk1yU5IIk72jr/UhyrySfS/LpJJ9Kcp82cnTlOaW5nndbxz8oyU1Jzm+W1q7Nl+TYJJ9Mcl6Sp7SU4RFd78U3kvxTSzmmkpzZnD8uTPKAlnJMJ3lb839yRpIVIzz2VuetJE9rfn8/mWS/FnO8pfm9ef4oMsyVI8luzTnsgubfA9vI0Wx7b3NO/UKSx7SVo9l+YJI7R/W3bhvvxze7ziPHtZhjvyRnN7+/L2sjR5Kdut6LLya5rI0czbbnNdu+mOSpLeb4oySfTfKxUX0Gmesz2KjPp9vI0Ma5tDdHW+fSud6PkZ9LNQaqalkuwM8Cb2vW/wI4pcUsOwD7AGcCR7SU4T7ATs363wIntZRjGphq1k8F/rLF/5dp4H3ApS1mOAh4T1vH78qxE/AhYGXbWboynQk8tqVjHwW8o1l/NHB6SzlOAl7RrP8p8PQRHnvWeQtYAVwErAQeBfxbGzmabfs254/nt/h+rAL2be77H8DrWnw/Vjb/HgT8Z1s5mu2vBz41qr9123g/1o7q52KBHO8A7tt2jq77TgVe0uL7cUVzHtkZuKSNHMC9m5/PAA8H3jCiHFt9Bhv1+XQbGdo4l/bmeGZL59K53o+Rn0td2l+Wc8vyI4GPN+sfpXMyakVVbayq1lovmww3VNUdzc0NwExLOTZX1ZZj70bnj2dbTgHeTUvvRZdHNS2Yf5MkLWV4BHAH8KEk709y75ZyAJBkJZ0PMp9pKcL1nRgJsBewrqUc
hwBfatYvBUb2Tfcc561Dga9V1Yaq+izwkJZyUFXfHcWx58tRVeu7cozsnLqN92NDs7obcHlbOZIcDBRw7SgybCsHsGvTOvQfSfZuI0c6vbcOAl7TtJY9so0cPZ4GvKvFHNfQ+WJ2N+CWlnIcCFxRVUXnnProEeXo/Qx2GCM+n871ObClc2lvjttbOpfO9X6M/Fyq9i3nYnkv4MfN+o+AkfzBHHdN95ZfotOK2FaGI5N8AXg+nT9WbWSYBp4OvLON43e5AfgZOkXQPYETW8pxrybHrwBvAl7aUo4tHg98suuLlVFbB2wEvg78C/CGlnJcCTyuWX88nfNaW7rPqdDpmbHsNV/svJTOz0mbOS4APgGc22KMFwKvbvH4Wzyqqh5L54vykXR/nsNq4Eg6PUKeAfxzSzkASLIncO+q+lqLMT4MfI3OF4CvaSnDfwFrkuxI55w60s+GXZ/BLqSl8+k4fA6cK0db59I5cozDuVQjtJyL5VuA3Zv1PYAfthdlPCTZHXgrcGpVbWwrR1V9qaqOBv4K+LOWYjwTeFeLxRgAVXVnVf2k+Zb7fXSGD7ThFuCzzbeqnwQe1FKOLZ5Gp9W/Lb8EbKqqw4Cn0t4Hu3OA9Uk+BewCfK+lHDD7nAqwuaUc4+Z0Ol05v9lmiKp6DJ3eGK9q4/hJDmlyfLuN43erqh80q++h3XPq1VV1bVV9D9iYEc45MIcnAx9s6+DN54//RaeHygOAV7TRk6qq1gH/Sqfn4Ql0vhAdie7PYMBNtHA+HZfPgdvIMfJz6Vw52j6XavSWc7H8OTrfGkJnDMRnW8zSuuaP9FnAy6rqqhZzrOy6+SPg9paiHA78RpKPAocmeW0bIZLs1nXz0cDVbeQALgYe2Hx4OZJOd7lWNN0Xf47ON++txQC2fOBeR+cLt5Grjj+qqsc1eVr7sAt8k87PyMqmS+lXWswyFpK8BLimqlrroZKOLRM23tYsbfhZ4EHNOfU44I1JVo06RPPzuWNzs7VzatO98wdJ9kyyC7BjVW1qI0tjZF2wt2GGzlCf9cBP6IzVbWXYUVW9pel58H7g/FEcc47PYCM/n47R58CtcrRxLu3NMUbnUo1YOg1Wy1OSfwCOoTN+6lldYxHayHIunSLkO3QmcjhzxMf/deD/AF9tNv1rGx/wkjycTje9zXT+aD67qm4YdY6eTGurak1Lxz4BeCWdLw2+Ref9aOUDVZLnAf+TzpjDZ1fVf7WU4wTghKr6/TaO32RYAbydzmQwOwIvqKrPtZDj3nQmCZqh0y39b0Z8/FnnLTofdv+Azu/ub1bVdS3lOAx4Ep2uix+pqv/dQo5zgZdw15c6F1XVSHrK9OT4f8DJzV3TwJ9X1UjG+m/r71qSM4FXV9VIxvz15PgAnSE2PwHupHMua+vn9Bt0WqdW0pmo75yWcryfzvljpH/n5sixZajRFHBGVb2xpRzHN1m+Azyvqob+pf1cn8Gaf0d2Pt1GhiMZ8bl0jhxvBv4vIz6XbiPHs5r1kZ5L1a5lXSxLkiRJkjSX5dwNW5IkSZKkOVksS5IkSZLUw2JZkiRJkqQeFsuSJEmSJPWwWJYkSZIkqYfFsiRJkiRJPSyWJUkTJ8m/Jzl8nvtfmuSPh3TsU5O8bhjPLUmSxseKtgNIkjSoqvqttjMstSQrqmpT2zkkSVKHLcuSpLGV5KAkX0/y9iRfS/KeJDsnOT/Jmmaf45NcmuTLST45x3P8dpKPJNkpyW1d209KcmazfmaSNyZZm+QbSZ64QLR9k3w0yTeT/H3Xc56S5KtJLk/yqq7tCx33C8DfI0mSxoYty5KkcXcY8Jyq+mySM4Df3XJHkn2ANwGPqapvJdm7+4FJng8cB/xqVd2ZZL7jHAQ8HDgEOC/Jz1TV+m3seyTwUOBO4Kok/wJsBl4FPAy4Gfh4kl+tqg8s8Pr2Ax5ZVZsX2E+SJI2QLcuSpHF3XVV9tll/G/DzXfcdA1xQVd8CqKofdt33G8AJwElVdWcfx3lX
Vc1U1TeBa4AHzLPvJ6vqR00xfSVwIPBzwPlVdVPTnfrtwGP6OO67LZQlSRo/FsuSpHFXC9zelq/SaS3ebxuPXbUdx+kuvjezcE+t+Y77kwUeK0mSWmCxLEkadwckeUSz/gzgwq77Pg88JsnBAD3dsC8DngucnWTfZtv3kzwwyRTwlJ7jPC3JVJJDgPsBVw2Y84vAY5OsTjINnAJ8uo/jSpKkMWSxLEkad1cBz0vyNWAv4F+33FFVNwGnAe9L8mXgnd0PrKoLgT8GPpxkNfAi4Bzgc8ANPce5lk7B+xHgd+YZrzynqrqhef7zgC8Dl1TVB5u75zuuJEkaQ6nqtzebJEmjleQg4JyqOmLIxzmzOc57hnkcSZI0OWxZliRJkiSphy3LkiTNIcn/oHMpqG7fqirHHEuStAxYLEuSJEmS1MNu2JIkSZIk9bBYliRJkiSph8WyJEmSJEk9LJYlSZIkSerx/wGpamiEh7/jpwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 1080x360 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Number of pick-ups per hour for a given day of the week\n",
    "df.plot('pickup_hour', 'pickup_day_of_week', colorbar=True, colormap=cm_plusmin, figsize=(15, 5))\n",
    "\n",
    "plt.xticks(np.arange(24), np.arange(24))\n",
    "plt.yticks(np.arange(7), weekday_names_list)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Группирование"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:15:24.313419Z",
     "start_time": "2020-07-31T01:15:11.180294Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                             </th><th>pickup_hour  </th><th>tip_amount        </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i> </td><td>0            </td><td>1.0583148055719316</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i> </td><td>1            </td><td>1.0395988285791358</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i> </td><td>2            </td><td>1.0271794254275992</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i> </td><td>3            </td><td>1.0004258190973754</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i> </td><td>4            </td><td>0.9259400499895432</td></tr>\n",
       "<tr><td>...                           </td><td>...          </td><td>...               </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>19</i></td><td>19           </td><td>1.0345855021860877</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>20</i></td><td>20           </td><td>1.0202804182400866</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>21</i></td><td>21           </td><td>1.0258125215768232</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>22</i></td><td>22           </td><td>1.0711251054555473</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>23</i></td><td>23           </td><td>1.077161833123374 </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#    pickup_hour    tip_amount\n",
       "0    0              1.0583148055719316\n",
       "1    1              1.0395988285791358\n",
       "2    2              1.0271794254275992\n",
       "3    3              1.0004258190973754\n",
       "4    4              0.9259400499895432\n",
       "...  ...            ...\n",
       "19   19             1.0345855021860877\n",
       "20   20             1.0202804182400866\n",
       "21   21             1.0258125215768232\n",
       "22   22             1.0711251054555473\n",
       "23   23             1.077161833123374"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_per_hour = df.groupby(by=df.pickup_hour).agg({'tip_amount': 'mean'})\n",
    "df_per_hour"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-31T01:15:30.915251Z",
     "start_time": "2020-07-31T01:15:30.550747Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0AAAAFNCAYAAAApYg+1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAjs0lEQVR4nO3debgkZXn38e+PGZAdVAYXFgcNLmiI4AiYGBdEBYyggSi4RHGZBIO7Jii+CJrF3SwSFBFUVBZRfEcFwahoXhOUYV81I+uAAiqCQmS93z+6RtvjWXqGU9Vnpr6f6+pruqvq9H2fZer07zxPPZ2qQpIkSZL6YK1xNyBJkiRJXTEASZIkSeoNA5AkSZKk3jAASZIkSeoNA5AkSZKk3jAASZIkSeoNA5Akac5K8vYkR4+7D0nSmsMAJEk9leSqJHcm2WzC9vOSVJKFHffztCTLh7dV1T9W1au67KNrzfdht3H3IUl9YQCSpH67Eth/xYMkfwisP752JElqlwFIkvrtOOAvhx6/DPj08AFJ7pfkA0muSXJDko8mWa/Zd/8kX0lyU5Kbm/tbDn3smUneneS7SX6Z5IyJI07NcRsApwEPTfKr5vbQJIcl+UxzzMJmZGpxkuuT/DjJW6b6xJI8pxnNujXJtUkOG9q34rkOaPbdnOSvkzwxyYVJfpHkI0PHr5XkHUmuTnJjkk8n2aTZ93sjV8OjOs3ncFLzMb9MckmSRc2+44CtgS83n/PfTv/tkiTdVwYgSeq3s4CNkzwmyTxgP+AzE455D/BI4PHAHwBbAIc2+9YCjgUexuCF/P8CH5nw8S8CDgA2B9YBfi+0VNVtwB7A9VW1YXO7foqenw5sCzwL+Ltppo/dxiDcbQo8BzgwyfMmHLNz81wvBP4ZOATYDXgs8IIkT22Oe3lzezrwcGDDST7P6ewFnND0smTFx1bVS4FrgOc2n/P7VuI5JUmrwAAkSVoxCvRM4DLguhU7kgRYDLyxqn5eVb8E/pFBUKKqflZVX6iq25t9/wA8dcLzH1tVP6yq/wVOYhCk7ovDq+q2qrqIQfjaf7KDqurMqrqoqu6tqguB4yfp7d1V9euqOoNBYDq+qm6squuA/wR2aI57MfChqrqiqn4FvA3YL8n8EXv+f1V1alXdw+Dr/Ucr8wlLkmbPqCduSdKa6zjgO8A2TJj+BixgcE3QOYMsBECAeQBJ1gc+DOwO3L/Zv1GSec2LfYCfDD3f7QxGT+6La4fuXw384WQHJdmZwejV4xiMPN0P+PyEw24Yuv+/kzxe0etDm1rDdecDDxqx54lfg3WTzK+qu0f8eEnSLHEESJJ6rqquZrAYwp7AFyfs/imDIPDYqtq0uW1SVSuCwZuBRwE7V9XGwFOa7WHl1YjHbTV0f2tgqqlyn2Mw3WyrqtoE+Ogq9kVT42ET6t7NIDDdxtDCEc1UwgUr8dyjft6SpFlgAJIkAbwS2LW5Fuc3qupe4OPAh5NsDpBkiyTPbg7ZiEFA+kWSBwDvvA893AA8cMXiAtP4P0nWT/JYBtcWnTjFcRsBP6+qXyfZicG1SKvqeOCNSbZJsiGDaYAnNiM4P2QwovOcJGsD72Aw2jSqGxhcVyRJ6oABSJJEVf2oqpZOsfvvgGXAWUluBf6DwagPDBYOWI/BSNFZwNfuQw+XMwgaVzSrsD10ikO/3fTzDeADzfU7k3kN8K4kv2SwaMNJq9obcAy/nSp4JfBr4LVN37c0tY5mcP3UbcDyyZ9mUv8EvKP5nKdc1U6SNDtS5ci7JGnua96Y9Upgba+dkSStKkeAJEmSJPWGAUiSJElSbzgFTpIkSVJvOAIkSZIkqTcMQJIkSZJ6Y/64G1hZm222WS1cuHDcbUiSJEmao84555yfVtWkb0q92gWghQsXsnTpVG9VIUmSJKnvklw91T6nwEmSJEnqDQOQJEmSpN4wAEmSJEnqDQOQJEmSpN4wAEmSJEnqDQOQJEmSpN4w
AEmSJEnqDQOQJEmSpN4wAEmSJEnqDQOQJEmSpN4wAEmSJEnqjfnjbkCSJEnSmuvGj5zWSZ3ND9pjpOMMQJIkSdIa6oZ/+e9O6jzo9U/qpM5scAqcJEmSpN4wAEmSJEnqjdV6CtxNR36mkzoLDnxJJ3UkSZIktcsRIEmSJEm9YQCSJEmS1BsGIEmSJEm9YQCSJEmS1Bur9SIIc8ENR76/kzoPOvCtndSRJEmS1mQGIEmSJLXi7GNv7KTOEw/YvJM6WjM4BU6SJElSbzgCtJq75l/37aTO1q87uZM6kiRJa5KffPDyTuo8+M2P7qTOmsARIEmSJEm94QiQJEmS1khX/fNPOqmz8A0P7qSOZkdrI0BJjklyY5KLp9ifJP+aZFmSC5Ps2FYvkiRJkgTtToH7JLD7NPv3ALZtbouBI1vsRZIkSZLaC0BV9R3g59Mcsjfw6Ro4C9g0yUPa6keSJEmSxnkN0BbAtUOPlzfbfjzxwCSLGYwSsfXWW3fSnEZ39see20mdJ/7VlzupI0mSpDXXarEIQlUdBRwFsGjRohpzO5LmqANOmW7W7ew59vlf66SOJEmafeNcBvs6YKuhx1s22yRJkiSpFeMMQEuAv2xWg9sFuKWqfm/6myRJkiTNltamwCU5HngasFmS5cA7gbUBquqjwKnAnsAy4HbggLZ6kSRJkiRoMQBV1f4z7C/gb9qqL0mSJEkTrRaLIEjS6mDPL725kzqnPu+DndSRJGlNNM5rgCRJkiSpU44ASZIkrYFOO/GnndTZ44WbdVJHmi2OAEmSJEnqDQOQJEmSpN4wAEmSJEnqDQOQJEmSpN4wAEmSJEnqDQOQJEmSpN4wAEmSJEnqDQOQJEmSpN7wjVAlSZJacPQXb+ykzqv+fPNO6khrCgOQ1ginf2LPTuo8+5WnTrr9xGN376T+Cw/4Wid1JEmS1lROgZMkSZLUGwYgSZIkSb3hFDhJs+IfTnx2J3UOeeHpndSRJElrJkeAJEmSJPWGI0CStAZ5zinv76TOV5//1k7qSJI02xwBkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJveEiCJKkWfOcLxzVSZ2v7rO4kzpafb3ulGs7qfOvz9+qkzqSZo8jQJIkSZJ6wwAkSZIkqTecAidJWqP82cmf7aTOV/Z9cSd1JEmzyxEgSZIkSb3hCJAkSbPouSd/qbNaX973eZNuf97J3+ik/pf2fUYndSRpNhmApDXEx457did1/uqlp3dSR5IkqQ1OgZMkSZLUG44ASZKkWbfvF87tpM7J++zYSR1Jaw5HgCRJkiT1hgFIkiRJUm8YgCRJkiT1hgFIkiRJUm8YgCRJkiT1hgFIkiRJUm8YgCRJkiT1hgFIkiRJUm+0GoCS7J7kB0mWJTl4kv1bJ/lWkvOSXJhkzzb7kSRJktRvrQWgJPOAI4A9gO2A/ZNsN+GwdwAnVdUOwH7Av7fVjyRJkiS1OQK0E7Csqq6oqjuBE4C9JxxTwMbN/U2A61vsR5IkSVLPzW/xubcArh16vBzYecIxhwFnJHktsAGwW4v9SJIkSeq5cS+CsD/wyaraEtgTOC7J7/WUZHGSpUmW3nTTTZ03KUmSJGnN0GYAug7Yaujxls22Ya8ETgKoqv8G1gU2m/hEVXVUVS2qqkULFixoqV1JkiRJa7o2A9DZwLZJtkmyDoNFDpZMOOYa4BkASR7DIAA5xCNJkiSpFa0FoKq6GzgIOB24jMFqb5ckeVeSvZrD3gy8OskFwPHAy6uq2upJkiRJUr+1uQgCVXUqcOqEbYcO3b8U+JM2e5AkSZKkFca9CIIkSZIkdcYAJEmSJKk3DECSJEmSesMAJEmSJKk3DECSJEmSesMAJEmSJKk3DECSJEmSesMAJEmSJKk3DECSJEmSemPGAJTkvaNskyRJkqS5bpQRoGdOsm2P2W5EkiRJkto2f6odSQ4EXgM8PMmFQ7s2Ar7bdmOSJEmSNNumDEDA54DTgH8CDh7a/suq
+nmrXUmSJElSC6YMQFV1C3ALsH+SecCDmuM3TLJhVV3TUY+SJEmSNCumGwECIMlBwGHADcC9zeYCtm+vLUmSJEmafTMGIOANwKOq6mct9yJJkiRJrRplFbhrGUyFkyRJkqTV2igjQFcAZyb5KnDHio1V9aHWupIkSZKkFowSgK5pbus0N0mSJElaLc0YgKrq8C4akSRJkqS2jbIK3LcYrPr2O6pq11Y6kiRJkqSWjDIF7i1D99cF9gHubqcdSZIkSWrPKFPgzpmw6btJvt9SP5IkSZLUmlGmwD1g6OFawBOATVrrSJIkSZJaMsoUuHMYXAMUBlPfrgRe2WZTkiRJktSGUabAbdNFI5IkSZLUtlGmwK0NHAg8pdl0JvCxqrqrxb4kSZIkadaNMgXuSGBt4N+bxy9ttr2qraYkSZIkqQ2jBKAnVtUfDT3+ZpIL2mpIkiRJktqy1gjH3JPkESseJHk4cE97LUmSJElSO0YZAXor8K0kVzBYCe5hwAGtdiVJkiRJLRhlFbhvJNkWeFSz6QdVdUe7bUmSJEnS7BtlFbh5wLOBhc3xuyWhqj7Ucm+SJEmSNKtGmQL3ZeDXwEXAve22I0mSJEntGSUAbVlV27feiSRJkiS1bJRV4E5L8qzWO5EkSZKklo0yAnQWcEqStYC7GKwEV1W1caudSZIkSdIsGyUAfQh4EnBRVVXL/UiSJElSa0aZAnctcLHhR5IkSdLqbpQRoCuAM5OcBvzm/X9GWQY7ye7AvwDzgKOr6j2THPMC4DCggAuq6kWjtS5JkiRJK2eUAHRlc1unuY2kef+gI4BnAsuBs5MsqapLh47ZFngb8CdVdXOSzVemeUmSJElaGTMGoKo6fBWfeydgWVVdAZDkBGBv4NKhY14NHFFVNze1blzFWpIkSZI0oxkDUJIFwN8CjwXWXbG9qnad4UO3YHD90ArLgZ0nHPPIpsZ3GUyTO6yqvjZz25IkSZK08kZZBOGzwOXANsDhwFXA2bNUfz6wLfA0YH/g40k2nXhQksVJliZZetNNN81SaUmSJEl9M0oAemBVfQK4q6q+XVWvAGYa/QG4Dthq6PGWzbZhy4ElVXVXVV0J/JBBIPodVXVUVS2qqkULFiwYobQkSZIk/b5RAtBdzb8/TvKcJDsADxjh484Gtk2yTZJ1gP2AJROO+RKD0R+SbMZgStwVIzy3JEmSJK20UVaB+/skmwBvBv4N2Bh440wfVFV3JzkIOJ3B9T3HVNUlSd4FLK2qJc2+ZyW5FLgHeGtV/WwVPxdJkiRJmtYoq8B9pbl7C/D0lXnyqjoVOHXCtkOH7hfwpuYmSZIkSa0aZQqcJEmSJK0RDECSJEmSesMAJEmSJKk3ZgxASR6Y5N+SnJvknCT/kuSBXTQnSZIkSbNplBGgE4AbgX2AfYGbgBPbbEqSJEmS2jDKMtgPqap3Dz3++yQvbKshSZIkSWrLKCNAZyTZL8laze0FDN6/R5IkSZJWK6MEoFcDnwPuAO5kMCXur5L8MsmtbTYnSZIkSbNplDdC3aiLRiRJkiSpbVMGoCSPrqrLk+w42f6qOre9tiRJkiRp9k03AvQmYDHwwUn2FbBrKx1JkiRJUkumDEBVtbi5u0dV/Xp4X5J1W+1KkiRJklowyiII/zXiNkmSJEma06a7BujBwBbAekl2ANLs2hhYv4PeJEmSJGlWTXcN0LOBlwNbMrgOaEUAuhV4e7ttSZIkSdLsm+4aoE8Bn0qyT1V9ocOeJEmSJKkVM14DZPiRJEmStKYYZREESZIkSVojGIAkSZIk9cZ0iyD8RpI/BhYOH19Vn26pJ0mSJElqxYwBKMlxwCOA84F7ms0FGIAkSZIkrVZGGQFaBGxXVdV2M5IkSZLUplGuAboYeHDbjUiSJElS20YZAdoMuDTJ94E7Vmysqr1a60qSJEmSWjBKADqs7SYkSZIkqQszBqCq+nYXjUiSJElS22a8BijJLknOTvKrJHcmuSfJrV00J0mSJEmz
aZRFED4C7A/8D7Ae8CrgiDabkiRJkqQ2jBKAqKplwLyquqeqjgV2b7ctSZIkSZp9oyyCcHuSdYDzk7wP+DEjBidJkiRJmktGCTIvbY47CLgN2ArYp82mJEmSJKkNo6wCd3WS9YCHVNXhHfQkSZIkSa0YZRW45wLnA19rHj8+yZKW+5IkSZKkWTfKFLjDgJ2AXwBU1fnANq11JEmSJEktGSUA3VVVt0zYVm00I0mSJEltGmUVuEuSvAiYl2Rb4HXAf7XbliRJkiTNvlFGgF4LPBa4AzgeuBV4Q4s9SZIkSVIrRlkF7nbgkOYmSZIkSautGQNQkkXA24GFw8dX1fbttSVJkiRJs2+UKXCfBT7J4M1Pnzt0m1GS3ZP8IMmyJAdPc9w+SaoJW5IkSZLUilEWQbipqlb6fX+SzAOOAJ4JLAfOTrKkqi6dcNxGwOuB761sDUmSJElaGaMEoHcmORr4BoOFEACoqi/O8HE7Acuq6gqAJCcAewOXTjju3cB7gbeO2rQkSZIkrYpRAtABwKOBtYF7m20FzBSAtgCuHXq8HNh5+IAkOwJbVdVXkxiAJEmSJLVqlAD0xKp61GwXTrIW8CHg5SMcuxhYDLD11lvPdiuSJEmSemKURRD+K8l2q/Dc1wFbDT3estm2wkbA44Azk1wF7AIsmWwhhKo6qqoWVdWiBQsWrEIrkiRJkjTaCNAuwPlJrmRwDVCAGmEZ7LOBbZNswyD47Ae8aMXOqroF2GzF4yRnAm+pqqUr9RlIkiRJ0ohGCUC7r8oTV9XdSQ4CTgfmAcdU1SVJ3gUsXZWV5SRJkiTpvpgxAFXV1av65FV1KnDqhG2HTnHs01a1jiRJkiSNYpRrgCRJkiRpjWAAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvdFqAEqye5IfJFmW5OBJ9r8pyaVJLkzyjSQPa7MfSZIkSf3WWgBKMg84AtgD2A7YP8l2Ew47D1hUVdsDJwPva6sfSZIkSWpzBGgnYFlVXVFVdwInAHsPH1BV36qq25uHZwFbttiPJEmSpJ5rMwBtAVw79Hh5s20qrwROa7EfSZIkST03f9wNACR5CbAIeOoU+xcDiwG23nrrDjuTJEmStCZpcwToOmCrocdbNtt+R5LdgEOAvarqjsmeqKqOqqpFVbVowYIFrTQrSZIkac3XZgA6G9g2yTZJ1gH2A5YMH5BkB+BjDMLPjS32IkmSJEntBaCquhs4CDgduAw4qaouSfKuJHs1h70f2BD4fJLzkyyZ4ukkSZIk6T5r9RqgqjoVOHXCtkOH7u/WZn1JkiRJGtbqG6FKkiRJ0lxiAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYg
SZIkSb1hAJIkSZLUGwYgSZIkSb1hAJIkSZLUGwYgSZIkSb3RagBKsnuSHyRZluTgSfbfL8mJzf7vJVnYZj+SJEmS+q21AJRkHnAEsAewHbB/ku0mHPZK4Oaq+gPgw8B72+pHkiRJktocAdoJWFZVV1TVncAJwN4Tjtkb+FRz/2TgGUnSYk+SJEmSeqzNALQFcO3Q4+XNtkmPqaq7gVuAB7bYkyRJkqQeS1W188TJvsDuVfWq5vFLgZ2r6qChYy5ujlnePP5Rc8xPJzzXYmBx8/BRwA/uQ2ubAT+d8ah2jbuHcde3h7lRfy70MO76c6GHcdefCz2Mu749zI36c6GHcdefCz2Mu/5c6GHc9edCD+OuPxs9PKyqFky2Y/59eNKZXAdsNfR4y2bbZMcsTzIf2AT42cQnqqqjgKNmo6kkS6tq0Ww81+raw7jr28PcqD8Xehh3/bnQw7jrz4Uexl3fHuZG/bnQw7jrz4Uexl1/LvQw7vpzoYdx12+7hzanwJ0NbJtkmyTrAPsBSyYcswR4WXN/X+Cb1daQlCRJkqTea20EqKruTnIQcDowDzimqi5J8i5gaVUtAT4BHJdkGfBzBiFJkiRJklrR5hQ4qupU4NQJ2w4duv9r4C/a7GESszKV7j4adw/jrg/2MBfqw/h7GHd9GH8P464P4+9h3PXBHuZCfRh/D+OuD+PvYdz1Yfw9jLs+jL+HcdeHFntobREESZIkSZpr2rwGSJIkSZLmlF4FoCS7J/lBkmVJDh5D/WOS3Ngs/925JFsl+VaSS5NckuT1Y+hh3STfT3JB08PhXffQ9DEvyXlJvjKm+lcluSjJ+UmWjqH+pklOTnJ5ksuSPKnj+o9qPvcVt1uTvKHjHt7Y/AxenOT4JOt2Wb/p4fVN/Uu6+vwnOw8leUCSryf5n+bf+3dc/y+ar8G9SVpfdWiKHt7f/H+4MMkpSTbtuP67m9rnJzkjyUPbqj9VD0P73pykkmzWZf0khyW5bui8sGdb9afqodn+2uZn4ZIk7+uyfpIThz7/q5Kc31b9aXp4fJKzVvx+SrLTGHr4oyT/3fye/HKSjVusP+lro67Oi9PU7+y8OE0PnZwXp6nf3nmxqnpxY7AQw4+AhwPrABcA23Xcw1OAHYGLx/Q1eAiwY3N/I+CHY/gaBNiwub828D1glzF8Ld4EfA74ypi+F1cBm42jdlP/U8CrmvvrAJuOsZd5wE8YrNffVc0tgCuB9ZrHJwEv7/jzfhxwMbA+g+sx/wP4gw7q/t55CHgfcHBz/2DgvR3XfwyD93g7E1g0pq/Bs4D5zf33juFrsPHQ/dcBH+36a9Bs34rB4kVXt3mOmuJrcBjwlra//zP08PTm/+L9msebd/09GNr/QeDQMXwNzgD2aO7vCZw5hh7OBp7a3H8F8O4W60/62qir8+I09Ts7L07TQyfnxWnqt3Ze7NMI0E7Asqq6oqruBE4A9u6ygar6DoPV7saiqn5cVec2938JXMbghWCXPVRV/ap5uHZz6/RCtCRbAs8Bju6y7lyRZBMGv3A+AVBVd1bVL8bY0jOAH1XV1R3XnQ+sl8F7kK0PXN9x/ccA36uq26vqbuDbwJ+3XXSK89DeDEIxzb/P67J+VV1WVfflDa5no4czmu8DwFkM3ruuy/q3Dj3cgJbPi9P8Pvow8LdjrN+ZKXo4EHhPVd3RHHNjx/UBSBLgBcDxbdWfpocCVoy4bELL58Ypengk8J3m/teBfVqsP9Vro07Oi1PV7/K8OE0PnZwXp6nf2nmxTwFoC+DaocfL6fjF/1ySZCGwA4MRmK5rz2uG9W8Evl5VXffwzwx+wd/bcd1hBZyR5JwkizuuvQ1wE3BsBtMAj06yQcc9DNuPln/JT1RV1wEfAK4BfgzcUlVndNkDg9GfP03ywCTrM/hL61YzfExbHlRVP27u/wR40Jj6mCteAZzWddEk/5DkWuDFwKEzHd9C/b2B66rqgq5rDzmomfJyTJtTMafxSAb/L7+X
5NtJnjiGHgD+FLihqv5nDLXfALy/+Vn8APC2MfRwCb/9I/Vf0NG5ccJro87Pi+N8bTZCD52cFyfWb+u82KcApEaSDYEvAG+YkK47UVX3VNXjGfwlYackj+uqdpI/A26sqnO6qjmFJ1fVjsAewN8keUqHteczmG5wZFXtANzGYHi/cxm8SfJewOc7rnt/Br9ctwEeCmyQ5CVd9lBVlzGYUnAG8DXgfOCeLnuYTA3mGvR2edAkhwB3A5/tunZVHVJVWzW1D+qydhPC384YgteQI4FHAI9n8IeJD46hh/nAA4BdgLcCJzWjMV3bn47/MDTkQOCNzc/iG2lmC3TsFcBrkpzDYErUnW0XnO61URfnxXG/Npuuh67Oi5PVb+u82KcAdB2/+xeELZttvZJkbQY/XJ+tqi+Os5dm2tW3gN07LPsnwF5JrmIwDXLXJJ/psD7wmxGIFdMrTmEwRbMry4HlQyNvJzMIROOwB3BuVd3Qcd3dgCur6qaqugv4IvDHHfdAVX2iqp5QVU8BbmYw73kcbkjyEIDm39am/cxlSV4O/Bnw4uYFz7h8lhan/EzhEQz+IHBBc37cEjg3yYO7aqCqbmj+QHYv8HG6PS+usBz4YjNd+/sMZgq0thjEZJppuX8OnNhl3SEvY3BOhMEfpzr/PlTV5VX1rKp6AoMg+KM2603x2qiz8+JceG02VQ9dnRdH+BrM6nmxTwHobGDbJNs0f3XeD1gy5p461fwV6xPAZVX1oTH1sGDFKiJJ1gOeCVzeVf2qeltVbVlVCxn8DHyzqjr9y3+SDZJstOI+g4sMO1sZsKp+Alyb5FHNpmcAl3ZVf4Jx/ZXzGmCXJOs3/y+ewWDOcaeSbN78uzWDFzyf67qHxhIGL3po/v2/Y+pjbJLszmBq7F5VdfsY6m879HBvOjwvAlTVRVW1eVUtbM6PyxlclPyTrnpY8WKz8Xw6PC8O+RKDhRBI8kgGi8T8tOMedgMur6rlHddd4Xrgqc39XYHOp+ENnRvXAt4BfLTFWlO9NurkvDhHXptN2kNX58Vp6rd3XqxZWk1hdbgxmGP/QwZ/SThkDPWPZzCsfxeDXy6v7Lj+kxkM4V7IYLrN+cCeHfewPXBe08PFtLzCzQy9PI0xrALHYCXCC5rbJWP6WXw8sLT5PnwJuP8YetgA+BmwyZi+/4c3J9OLgeNoVn3quIf/ZBA+LwCe0VHN3zsPAQ8EvsHghc5/AA/ouP7zm/t3ADcAp4/ha7CMwXWiK86Nra3CNkX9LzQ/ixcCX2ZwAXCnX4MJ+6+i3VXgJvsaHAdc1HwNlgAPGcPPwTrAZ5rvxbnArl1/D4BPAn/d5uc+w9fgycA5zXnpe8ATxtDD6xm8Xvsh8B4gLdaf9LVRV+fFaep3dl6cpodOzovT1G/tvJimsCRJkiSt8fo0BU6SJElSzxmAJEmSJPWGAUiSJElSbxiAJEmSJPWGAUiSJElSbxiAJEmzKsnCJON4D5fhHl6X5LIk075zeZIzkyzqqi9J0vjNH3cDkiSNIsn8qrp7xMNfA+xW43szSUnSHOUIkCSpDfOSfDzJJUnOSLIeQJLHJzkryYVJTkly/2b7b0ZikmyW5Krm/suTLEnyTQZvSvg7krwpycXN7Q3Nto8yeMPh05K8ccLx6yU5oRkdOgVYb2jfkUmWNj0f3mzbNcmXho55ZvNxkqTVlAFIktSGbYEjquqxwC+AfZrtnwb+rqq2By4C3jnCc+0I7FtVTx3emOQJwAHAzsAuwKuT7FBVfw1cDzy9qj484bkOBG6vqsc0tZ8wtO+QqloEbA88Ncn2wLeARydZ0BxzAHDMCD1LkuYoA5AkqQ1XVtX5zf1zgIVJNgE2rapvN9s/BTxlhOf6elX9fJLtTwZOqarbqupXwBeBP53huZ4CfAagqi4ELhza94Ik5wLnAY8FtquqAo4DXpJkU+BJwGkj9CxJmqO8BkiS1IY7hu7fw9BUsynczW//KLfuhH23zVZTU0myDfAW
4IlVdXOSTw71cSzwZeDXwOdX4jokSdIc5AiQJKkTVXULcHOSFaM0LwVWjAZdxW+no+074lP+J/C8JOsn2QB4frNtOt8BXgSQ5HEMprsBbMwgaN2S5EHAHkN9X89gSt07GIQhSdJqzBEgSVKXXgZ8NMn6wBUMrqkB+ABwUpLFwFdHeaKqOrcZqfl+s+noqjpvhg87Ejg2yWXAZQym51FVFyQ5D7gcuBb47oSP+yywoKouG6U3SdLclcH0ZkmSNJUkHwHOq6pPjLsXSdJ9YwCSJGkaSc5hMD3umVV1x0zHS5LmNgOQJEmSpN5wEQRJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQbBiBJkiRJvWEAkiRJktQb/x/Hxpf52XAk8QAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 1008x360 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(14, 5))\n",
    "\n",
    "sns.barplot(x=df_per_hour.pickup_hour.values, y=df_per_hour.tip_amount.values)\n",
    "plt.title('Mean tip amount')\n",
    "plt.xlabel('hour of day')\n",
    "plt.ylabel('mean tip amount')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Больше примеров можно найти [в документации](https://docs.vaex.io/en/latest/tutorial.html)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
